UCS INTENSITY AND PERFORMANCE IN EYELID
CONDITIONING 1


KENNETH W. SPENCE AND JOHN R. PLATT


University of Texas 


Studies of the effects of UCS intensity on performance in human eyelid condi- 
tioning are reviewed, particularly with respect to a recent claim that this 
variable determines the proportion of Ss who condition and not the performance 
level of individual Ss. The evidence reviewed clearly indicates that UCS intensity 
does affect the level of performance in Ss who condition. The function involved 
appears to be negatively accelerated and to approach an asymptote within a 
relatively small range of intensity values. These effects of UCS intensity are
interpreted as reflecting both motivational (D) and associative (H) factors. 


A recent study by Burstein (1965) has 
raised the question as to whether variation of 
the intensity of the UCS affects performance 
level in classical eyelid conditioning. Employ- 
ing two different intensities of air puff (.75 
and 5.0 psi), Burstein found in the first phase
of his experiment (50 trials) that when the 
data of all the subjects (Ss) were employed, 
the group conditioned with the stronger puff 
gave a significantly higher frequency of CRs 
than that which had the weaker puff. How- 
ever, when the responses of nonconditioners, 
defined as Ss who averaged less than 10% 
CRs, were removed from the data, the effect 
of UCS intensity was no longer significant. 
In a second phase of the experiment, which 
involved 60 further trials, a similar result was 
obtained; removal of the nonconditioners 
changed the finding from being significant to 
nonsignificant.

On the basis of these findings and a re- 
analysis of the data of an earlier study by 


1 This paper is part of a project concerned with 
the influence of motivation on performance in condi- 
tioning and learning. Much of the research cited was 
conducted under Contract NOnr-1509(18) between 
the State University of Iowa and the Office of Naval 
Research. Preparation for publication was supported
by Contract NOnr-375(18) between the University
of Texas and the Office of Naval Research. Acknowledgment
is made to Abigail B. Capaldi who assisted
in the analyses of data. 


Passey (1948) which also showed no signifi- 
cant effect of UCS intensity when noncondi- 
tioners were removed from consideration, 
Burstein concluded that this variable is not 
a determinant of the performance level of S 
in such conditioning. Instead, he suggested 
that UCS intensity determines only whether 
an S will or will not condition. To quote Burstein
directly: “This view holds that with
lower UCS intensities fewer Ss condition than 
is the case with higher UCS intensities and 
that differences reported in previous studies 
are the artifactual result of averaging group 
data [p. 303].” 

This interpretation of these studies is some- 
what different, even quite counter in some 
respects, to that offered by Hull (1952) 
and the senior author (Spence, 1956; 1960). 
According to the latter formulation, evidence 
that group performance levels in such experi- 
ments vary with UCS intensity has been 
interpreted as implying that response frequency
(Rp), or its immediate theoretical
determinant, excitatory potential (E), is a
function of UCS intensity (Su). On the basis
of earlier studies (cf. Passey, 1948; Spence, 
1953, 1956; Spence & Taylor, 1951) it was 
assumed that variations of UCS intensity affect
the strength of E by changing the drive
level (D) of the subject. The results of 
subsequent experiments (Ross & Hunter, 

Fig. 1. Empirical functions relating conditioning
performance level to UCS intensity in a ready signal
situation for all Ss and for the middle half selected
on the basis of total CRs.


1959; Spence, Haggard, & Ross, 1958a, 1958b; 
Trapold & Spence, 1960), employing a design
in which groups of subjects were equated
for drive level (D), but differed in the 
strength of UCS administered on condition- 
ing trials, led to the further inference that 
the growth of habit strength (H) is also a
function of UCS intensity. This theoretical
formulation can be summarized as follows 2:


Rp = f(E)        [1]
E = H × D        [2]
H = f(Su)        [3]
D = f(Su)        [4]
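
As a purely illustrative sketch of Equations 1-4, the Python fragment below assumes specific forms that the text does not give: a negatively accelerated (exponential) growth function for both H and D, and a cumulative normal of the superthreshold value of E for Rp. H is computed from the paired (reinforcing) puff and D from the average intensity of all puffs received, as in the equated-drive designs described in this report. Every function name and parameter value (`rate`, `threshold`, `sigma`, and so on) is hypothetical and chosen only to show how the pieces combine.

```python
import math


def growth(s_u, asymptote, rate):
    """Hypothetical negatively accelerated function of UCS intensity (psi)."""
    return asymptote * (1.0 - math.exp(-rate * s_u))


def habit_strength(s_paired):
    # Equation 3: H = f(Su). Asymptote and rate are assumed values.
    return growth(s_paired, asymptote=1.0, rate=2.0)


def drive(s_mean):
    # Equation 4: D = f(Su). Asymptote and rate are assumed values.
    return growth(s_mean, asymptote=1.0, rate=2.0)


def excitatory_potential(s_paired, s_mean):
    # Equation 2: E = H x D.
    return habit_strength(s_paired) * drive(s_mean)


def response_probability(e, threshold=0.3, sigma=0.25):
    # Equation 1: Rp = f(E), taken here as a normal integral of the
    # superthreshold value of E. Threshold and sigma are assumed values.
    z = (e - threshold) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Same puff on every trial, so paired intensity equals mean intensity.
    for psi in (0.25, 0.5, 1.0, 2.0, 5.0):
        e = excitatory_potential(psi, psi)
        print(f"{psi:4.2f} psi -> Rp = {response_probability(e):.2f}")
```

Under these assumed parameters the computed Rp rises steeply at low intensities and changes very little above about 1 psi, which is the qualitative shape the formulation implies; the particular numbers carry no empirical weight.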


In his brief reference to past research 
studies, Burstein appears not to be entirely 
clear as to the implications of the findings of 
the different types of studies we have con- 
ducted. For example, he cites two studies 
from our laboratory as confirming Passey’s 
finding that differences in UCS intensity lead 
to differences in performance level in eyelid 
conditioning (Ross & Hunter, 1959; Spence, 


2 Details of the portion of the theory between the
intervening variable E and the empirical response
measure, Rp, involving such theoretical concepts as
oscillatory inhibition (Io) and response threshold
(L) have been omitted. It is sufficient to state that
the assumptions made with regard to these other
intervening variables lead to the implication that
Rp is a normal integral function of the superthreshold
value of E.


Haggard, & Ross, 1958a). In the first of these
two studies, the intensity of the unconditioned
stimuli was the same for both groups,
while in the second study, the main groups
compared in two separate experiments again
did not differ in the average strength of the
air puffs that were used. Thus performance
in these experiments differed in spite of the
fact that there was no difference in the intensities
of the puffs employed. The difference
depended upon whether the stronger or
weaker of the two puffs used was paired with
the CS.

In order to clarify these and other aspects
of this problem of UCS intensity in classical
conditioning, the present report not only reviews
the findings of these past studies, but
also attempts to ascertain by various kinds
of new analyses of them what bearing they
have on these alternative theoretical interpretations.


EMPIRICAL FUNCTION RELATING 
PERFORMANCE AND UCS 
INTENSITY 


In considering the problem of whether UCS
intensity affects performance, it is important
to keep in mind the nature of the empirical
function relating performance to UCS intensity
values: Rp = f(Su). Two of our studies
(Ross & Spence, 1960; Spence, 1958) have
presented data which reveal that this relation
is a negatively accelerated one that approaches
an asymptote within a relatively
small range of puff intensity values. Figure 1
presents a further summary of the findings
on this function. The solid curve in this figure
represents mean percentage of CRs, equally
weighted for male and female Ss, made on
Trials 61-80 in six different experiments (...
at 2.0 psi) that have been conducted under
more or less the same conditions. These
studies, which were conducted for other purposes,
have all involved a ready signal, ...
as the CS, and a CS-UCS interval of approximately
500 msec. (Reynolds, 1958; M...,
1959; Rundquist & Ross, 1958; Rundquist
& Spence, 1959; Spence, 1958; Spence &
Ross, 1959).

The broken curve joining the open circles
was obtained by taking the middle 50% of
the Ss in these experiments, again treating
TABLE 1

EFFECTS OF UCS INTENSITY ON CONDITIONING PERFORMANCE WITH AND
WITHOUT NONCONDITIONERS ELIMINATED

                                      Puff                  Percent CRs
Experiment and N                      intensities  Trials   Strong   Weak    Diff.    p
                                      (psi)

1. Spence-Taylor (1951)               .6 vs. 2.0   51-100
     All Ss (N = 100)                                        63.4    53.6     9.8    ns
     Conditioners only (N = 91)                              65.9    61.9     4.0    ns
2. Spence (1958)                      .25 vs. 1.5  41-80
     All Ss (N = 120)                                        59.2    37.7    21.5    <.001
     Conditioners only (N = 99)                              62.1    50.5    11.6    <.05
3. Spence-Weyant (1960)               .25 vs. 2.0  51-100
     All Ss (N = 72)                                         68.9    44.9    24.0    <.001
     Conditioners only (N = 62)                              74.8    54.9    19.9    <.001
4. Spence (1953)                      .25 vs. 5.0  51-100
     All Ss (N = 40)                                         73.5    40.0    33.5    <.001
     Conditioners only (N = 36)                              73.5    49.1    24.4    <.005
5. Spence-Haggard-Ross (1958a)        .33 vs. 2.0  31-50
     All Ss (N = 100)                                        58.4    22.4    36.0    <.001
     Conditioners only (N = 73)                              66.3    37.8    28.5    <.005
6. Rundquist-Spence-Stubbs (1958)     .33 vs. 2.0  41-60
     All Ss (N = 60)                                         60.9    43.5    17.4    <.05
     Conditioners only (N = 56)                              67.2    45.0    22.2    <.01


the data from men and women separately and 
giving them equal weight. This curve is not 
only free from the influence of inhibitors, that 
is, Ss who never give a CR, but also tends 
to lessen the depressing effect at higher values 
resulting from the fact that the frequency 
measure has a ceiling, that is, 100%. That 
is, once performance reaches 100% this mea- 
sure is not able to reflect any further increase 
that the response tendency may develop. 
Elimination of the top 25% of the Ss removed 
most of those that had reached the ceiling 
earlier in training. 

It is clearly evident from these curves that 
the major change in the response measure 
occurs in the range of intensity values less 
than 1 psi. It is also apparent that the find- 
ings of experiments that compare the per- 
formance of groups conditioned under differ- 
ent puff intensities will depend upon the 
values of UCS intensity that are employed. 
The use of intensity values at points over 
which the change in response strength is 
negligible is likely to lead to the erroneous 
conclusion that response strength does not 
vary with puff intensity. Obviously, the higher
the intensity of the weaker UCS employed 
and/or the smaller the difference in the in- 


tensities compared, the less is the likelihood 
of obtaining a significant difference. Unfor- 
tunately, as we shall see, Burstein’s own 
study employed puff intensities that would 
be expected, on the basis of the data in 
Figure 1, to produce a relatively small dif- 
ference in CR frequency, while three of 
Passey’s four groups were conditioned with 
intensity values that are in the region at 
which response strength is asymptotic. 


Studies Involving Differences in UCS In- 
tensity 


The primary basis underlying Burstein’s 
conclusion that UCS intensity is not a de- 
terminant of performance level in eyelid con- 
ditioning, but rather determines whether or 
not an S conditions, is the finding that with
the removal of nonconditioners from his 
data, the effect of UCS intensity was no 
longer significant. Leaving aside for the 


3 Passey reported using puff intensities of 7.5, 18.0, 
44.0 and 88.0 psi. Since puffs of such high intensity 
would undoubtedly produce damage to the eye, it 
is possible that Passey misplaced a decimal point in 
converting from millimeters of mercury to psi and 
that the actual values employed may have been .75, 
1.8, 4.4 and 8.8 psi. 
moment the question of the appropriateness
of comparing the performances of the remain- 
ing Ss, we have examined the data of a num- 
ber of our studies that have compared groups 
conditioned with different puff intensities. 
Table 1 presents the results of this analysis. 

The first column of the table gives the 
number of Ss involved in each study as well 
as the number remaining after noncondition- 
ers were eliminated. In these analyses, any S 
who made less than 10% CRs in the trial 
blocks being analyzed was classified as a nonconditioner.
The second column displays the
puff intensities which were paired with the 
CS in the several groups, and the third indi- 
cates the trials on which the analyses were 
based. In each study except the last two, the 
last half of the conditioning trials were ana- 
lyzed. Column 4 presents the mean percent 
CRs for the various groups, while Column 5 
gives the differences between these means and 
Column 6 the corresponding significance 
levels. All the studies in the table involved a
ready signal except the third. The CS-UCS 
intervals involved were 520 msec. in the first 
study, 755 msec. in the fourth, and 500 
msec. in the remainder. 
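
The elimination rule used in these analyses can be sketched in Python as follows. This is a minimal illustration, assuming hypothetical CR counts and helper names (`percent_crs`, `conditioners_only`, `strong_minus_weak`); it is not the original analysis, and it reports only the difference between group means, not the significance test.

```python
def percent_crs(cr_counts, n_trials):
    """Convert each S's CR count to a percentage of the trials analyzed."""
    return [100.0 * count / n_trials for count in cr_counts]


def conditioners_only(percents, criterion=10.0):
    """Keep Ss at or above the criterion; those below are nonconditioners."""
    return [p for p in percents if p >= criterion]


def mean(values):
    return sum(values) / len(values)


def strong_minus_weak(strong, weak, criterion=None):
    """Difference in mean percent CRs, optionally after elimination."""
    if criterion is not None:
        strong = conditioners_only(strong, criterion)
        weak = conditioners_only(weak, criterion)
    return mean(strong) - mean(weak)


if __name__ == "__main__":
    # Hypothetical CR counts in a 50-trial block, five Ss per group.
    strong = percent_crs([40, 35, 30, 25, 2], 50)
    weak = percent_crs([30, 25, 20, 2, 0], 50)
    print(strong_minus_weak(strong, weak))        # all Ss
    print(strong_minus_weak(strong, weak, 10.0))  # conditioners only
```

In this made-up example two Ss are dropped from the weak-puff group and only one from the strong-puff group, so the comparison after elimination is between subsamples of unequal selectivity.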

The first three studies employed Ss selected 
on the basis of high and low scores (highest 
and lowest pentiles) on the Manifest Anxiety 
Scale (Taylor, 1953). These studies were 
analyzed in 2 × 2 analyses of variance with
the main effects being Anxiety and UCS Intensity.
In each case, regardless of whether
nonconditioners were eliminated, the Anxiety
× UCS interaction was not significant. Thus,
only the significance levels for the UCS main 
effects are shown in Table 1. As can be seen, 
the results of these three studies support our 
interpretation that performance level in- 
creases with UCS intensity although the 
magnitude of the effect is inversely related
to the strength of the weak puff (Study 1
vs. 2 and 3) and directly related to the difference
between the strengths of the puffs (Study 2 vs. 3).
The present data also indicate that these rela- 
tionships hold in spite of the removal of non- 
conditioners. 

The fourth study obtained a highly signifi- 
cant effect for UCS intensity among randomly 
selected Ss, regardless of whether nonconditioners
were included. The fifth study requires
special comment in that the strong
UCS group received the strong puff paired
with the CS and an equal number of weak
puffs without the CS. The weak UCS group
received the weak puff on both CS and non-CS
trials. Again a highly significant effect of
UCS intensity was obtained. The last study
in Table 1 also requires comment in that the
data represent CRs to the positive stimulus
in a differential conditioning experiment. As
before, UCS intensity is significant, regardless
of the elimination of nonconditioners. In this
case, however, the elimination procedure actually
increased the difference between the
group means as well as its significance level.

Several studies from outside the Iowa laboratory
also require comment in this section.
Burstein (1965) reported that after elimination
of Ss who made no CRs from the data
of Passey (1948), the effect of UCS intensity
was not significant. While there is reason for
confusion as to what intensities Passey actually
employed, at least three of his four values
were probably greater than 1.0 psi and
would not have been expected to produce
different performance levels on the basis of
the empirical functions presented in the previous
section. This proliferation of equivalent
groups, together with the small sample sizes
involved, probably accounts for Burstein’s
failure to obtain significance in his reanalysis
of these data. Support is given to this possibility
by the fact that an analysis of variance
between Passey’s two extreme groups with
zero-responders eliminated shows a significant
effect of UCS intensity (F = 6.44; df = ...;
p < .025).

A second study (Gormezano, Moore, &
Deaux, 1962) which Burstein cites as failing
to yield a significant UCS intensity effect is
actually irrelevant to the question of whether
this variable determines performance level in
classical eyelid conditioning since it involved
the avoidance-classical, yoked-comparison
technique. As Gormezano et al. clearly indicated,
a significant Classical-Avoidance ×
UCS Intensity interaction was obtained as a
result of the negative correlation between the
performance levels of the paired Ss. That is,
as UCS intensity increased, the performance
level of avoidance Ss increased so that their
yoked classical controls received fewer reinforcements.
The effect of this partial reinforcement
was sufficiently strong to produce
an inverse ordering of the classical groups on 
the UCS intensity dimension. Thus, this 
technique completely confounds UCS intensity 
with partial reinforcement. 

Two additional studies (Beck, 1963; 
Walker, 1960) should be mentioned, although 
we do not have the data necessary for a re- 
analysis with nonconditioners eliminated. 
Both studies compared a .5 with a 5.0 psi 
puff and found highly significant effects for 
this variable over the entire conditioning 
period (Trials 1-80). The Beck study yielded 
an F of 43.55 (df=1,48; p< .001) and 
Walker found an F of 77.62 (df = 1,152; 
p < .001). With effects of this magnitude, it 
is very doubtful that removal of noncondi- 
tioners would eliminate the significant effect 
of UCS intensity. 

In summary, the results reviewed in this 
section show very clearly that higher UCS 
intensity values produced higher levels of 
conditioning performance, over and above any 
effect this variable might have on the propor- 
tion of Ss who show any conditioning at all. 
Studies that failed to obtain this effect em- 
ployed a combination of UCS intensity values 
which were too far out on the negatively ac- 
celerated function relating performance level 
to UCS intensity to have been expected to 
produce a significant difference without a very 
large number of subjects. 




Studies Involving Different Reinforcing Stim- 
uli with Puff Intensities Equated. 


The studies discussed in the previous sec- 
tion involved comparison of the performances 
of groups in which the UCS differed in in- 
tensity. In addition to this type of experi- 
ment, the senior author has conducted a series 
of investigations in which the groups compared 
were equated in terms of the intensities of 
the UCS employed. The primary objective of 
these latter studies was to equalize the drive 
level of the Ss, but vary the amount of rein- 
forcement. Drive level was equated by em- 
ploying two different intensities of the UCS 
with each being present on half of each 
group’s trials. The high-reinforcement group 
(H) received the strong UCS (2.0 psi puff) 
on the conditioning, that is, paired CS-UCS 


Fig. 2. Performance curves for groups differing in
reinforcing and/or unpaired UCS intensity. (From
Spence, Haggard, and Ross, 1958a.)
[Axes: per cent conditioned responses by number of
conditioning trials, in blocks of 10 from 1-10 to 41-50;
N = 50 per group.]


trials and the weak UCS (.33 psi puff) on 
nonconditioning trials, which in the case of 
the particular studies to be discussed have 
involved presenting the UCS alone. The re- 
verse relations held for the low-reinforcement 
group (L), the weak puff being used on con- 
ditioning trials and the strong puff on the 
nonconditioning (UCS alone) trials. 

Typical results of such experiments are 
shown in Figure 2. As may be seen, the curve 
for Group H rises well above that for Group 
L. Since the level of D, defined in terms of 
the average intensities of the puffs employed, 
was equated for these two groups, the signifi- 
cant difference in their performance levels 
must reflect some factor other than D. In 
terms of our theory as outlined in the first 
section, this factor is habit strength (H), 
which is related to the different intensities 
of the puff strengths on the paired, condi- 
tioning trials. An extension of this study that 
employed a longer acquisition period (100 
trials) indicated that it was the asymptote 
and not the rate-of-approach parameter of 
habit strength that varied with the intensity 
of the reinforcing UCS (Ross & Hunter, 
1959).

Returning to Figure 2, Group LL received 
the weak puff on all trials. The difference 
between its performance and that of Group L, 
which was significant, presumably reflects a
difference in the levels of D only, since these
two groups received different mean puff intensities,
but UCS intensity was equated on
the conditioning (i.e., H-producing) trials.
The difference between Groups H and LL,
which was also significant, is interpreted by
the present theory as reflecting differences in
both habit strength and drive level. The
experiments in the previous section would
also reflect differences in both H and D, as
not only were the intensities of the air puff
different, but they differed on conditioning
(H-producing) trials.

The question can also be raised as to
whether the group differences obtained in
these experiments with different reinforcing
UCS intensities merely reflect differences in
the proportion of Ss who do and do not
condition as Burstein has suggested. As in
the case of the first set of studies, Table 2
shows the effect of removing the nonconditioners
from the data. The first four studies
were conducted in the standard conditioning
situation with the ready signal being used.
The last two employed a masking learning
task (Spence, Homzie, & Rutledge, 196...).
All of the analyses involved the last 20 trials
of the acquisition period. In the case of
Experiment 4, which was a differential conditioning
study, the responses analyzed were
those made to the positive CS.

As may be seen, all six studies gave a significant
difference after the nonconditioners,
again defined in terms of failure to give 10%
CRs in the given period of acquisition, were
removed. It is clearly evident that the differences
in performance between the groups
in these studies are not a function of different
numbers of Ss who become conditioned and
who then respond (on the average) at the
same frequency level. While some Ss do not
become conditioned, those who do so respond
at different frequency levels depending on the
intensity of the paired (reinforcing) UCS.

Not included in Table 2 is the comparison
between Groups L and LL in the experiment
described at the beginning of this section. In
the case of these groups, elimination of the
nonconditioners again did not result in the
difference being nonsignificant. Indeed the
difference and F were somewhat larger after


TABLE 2

EFFECTS OF REINFORCING UCS INTENSITY ON CONDITIONING PERFORMANCE
WITH AND WITHOUT NONCONDITIONERS ELIMINATED

                                      Puff                  Percent CRs
Experiment and N                      intensities  Trials   H        L       Diff.
                                      (psi), L vs. H

1. Spence-Haggard-Ross (1958a)        .33 vs. 2.0  ...
     All Ss (N = 100)                                        58.4    35.6    22.8
     Conditioners only (N = 77)                              ...     ...     ...
2. Ross-Hunter (1959)                 .33 vs. 2.0  81-100
     All Ss (N = 66)                                         ...     ...     ...
     Conditioners only (N = 56)                              ...     ...     ...
3. Trapold-Spence (1960)              .33 vs. 2.0  71-90
     All Ss (N = 55)                                         ...     ...     ...
     Conditioners only (N = 50)                              ...     ...     ...
4. Spence-Tandler (1963)              .33 vs. 2.0  41-60
     All Ss (N = 80)                                         ...     ...     ...
     Conditioners only (N = 77)                              ...     ...     ...
5. Homzie-Weiss (1965)                .33 vs. 2.0  31-50
     All Ss (N = 60)                                         ...     ...     ...
     Conditioners only (N = 56)                              ...     ...     ...
6. Spence (unpublished) a             .33 vs. 2.0  21-40
     All Ss (N = 160)                                        ...     ...     ...
     Conditioners only (N = 151)                             ...     ...     ...

a Performed at the State University of Iowa, 1962.
the removal. In this instance, the different
performance of the two groups cannot be 
related to the reinforcing UCS, which was 
equated, but must be a function of the in- 
tensity of the UCS that was given alone on 
a random half of the trials. 

The findings of this section, like those of the 
previous section, again show that removal of 
the data of Ss who do not condition does not 
eliminate the difference in performance be- 
tween groups of Ss conditioned with different 
paired puff intensities. But even if this 
method of treating the data had led to non- 
significant differences in the resultant groups, 
there is reason to question the admissibility 
of comparing groups in which unequal num- 
bers of Ss have been eliminated on the basis 
of some level of performance criterion (e.g., 
10% CRs). A major difficulty with this pro- 
cedure is that, assuming the distribution of 
conditionability of Ss (i.e., their capacity to 
be conditioned) is the same in the original 
groups, it places one in the position of com- 
paring subsamples in which this is no longer 
the case. That is, more Ss from the low end 
of the distribution of conditionability are 
removed from one group (usually the low- 
intensity group) than the other. The subse- 
quent failure to find performance differences 
between the resultant groups is all but im- 
possible to interpret. It could, for example, 
merely reflect the fact that the higher aver- 
age level of conditionability of the group of 
Ss that had the weak puff tends to com- 
pensate for the higher drive level of the 
strong-puff group from which fewer of the
poorest conditioners have been removed.

A more appropriate procedure would be to 
remove the same number of Ss from the low 
end of each group, this number being that 
necessary to eliminate all Ss from the weaker- 
puff group that do not meet the criterion of 
10% CRs. The resulting group would thus 
consist only of Ss who had conditioned and
were also comparable in conditioning ability.
As might be expected, this procedure led to 
differences between the high- and low-intensity 
groups that were highly significant in all
instances including the single study (Spence
and Taylor) that was not significant in
Table 1.
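
The equal-number procedure just described can be sketched in Python as follows. The function name `trim_equal` and the scores are hypothetical; the sketch simply counts the Ss in the weaker-puff group who fall below the criterion and removes that many of the lowest scorers from both groups.

```python
def trim_equal(strong, weak, criterion=10.0):
    """Drop the same number of lowest-scoring Ss from each group.

    The number dropped is the count of Ss in the weaker-puff group
    whose percent CRs fall below the criterion.
    """
    n_drop = sum(1 for p in weak if p < criterion)
    strong_kept = sorted(strong)[n_drop:]
    weak_kept = sorted(weak)[n_drop:]
    return strong_kept, weak_kept


if __name__ == "__main__":
    # Hypothetical percent-CR scores, five Ss per group.
    strong = [80.0, 70.0, 60.0, 50.0, 4.0]
    weak = [60.0, 50.0, 40.0, 4.0, 0.0]
    strong_kept, weak_kept = trim_equal(strong, weak)
    print(strong_kept, weak_kept)
```

Because the same number of Ss is taken from the low end of each group, the retained subsamples cover the same portion of the conditionability distribution. The sketch assumes, as the text does, that the stronger-puff group contains no more nonconditioners than the weaker-puff group.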


ADDITIONAL STUDIES OF THE EFFECTS OF
VARYING UCS INTENSITY


Factorial Design Experiment: Burstein, 1965 


Turning now to Burstein’s own study, he 
attempted to avoid the ceiling problem in- 
herent in the frequency measure by employ- 
ing a CS-UCS interval that resulted in a 
much lower level of conditioning perform- 
ance. He gave 56 Ss 50 conditioning trials 
with a 1000 msec. CS-UCS interval. Half of 
the Ss received a .75 psi UCS and the other 
half, a 5.5 psi one. Then half of each group 
was given 60 additional trials with the same 
UCS as in the first phase, while the remaining 
half was given the same number of trials 
with the other intensity value. Analysis of
variance of total CRs in the preshift period 
yielded a significant UCS intensity effect 
(F =4.12; df= 1,54; p< .05); however, 
this F was reduced to less than 1.0 when the 
data of Ss making less than 10% CRs in this 
period were eliminated. A 2 X 2 analysis of 
variance of total postshift CRs showed sig- 
nificant effects of both preshift (F = 4.64; 
df = 1,52; p < .05) and postshift (F = 4.66; 
df = 1,52; p < .05) UCS intensity. Both of 
these Fs were also reduced to less than 1.0 
when the data of the preshift nonconditioners 
were excluded. 

In view of the UCS values (.75 and 5.5 
psi) employed, it is not surprising that the 
difference between the two groups, even with 
all Ss included, was barely significant. In 
terms of the authors’ theory, which also as- 
sumes that the asymptote of habit strength 
(H) is a function of the CS-UCS interval, 
it is not unexpected that a relatively large 
number of the Ss trained with the weaker 
puff did not meet the conditioning criterion. 
The combination of a very low H value and 
the medium value of D would be expected to 
produce a distribution of E values, some proportion
of which would be below the
old value of E (L) necessary for a response 
to occur. 
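
The threshold argument can be illustrated with a short Python sketch. It assumes, purely for illustration, that E is normally distributed across Ss; the text specifies only that some proportion of the E values falls below L. The function name and all numerical values are hypothetical and in arbitrary units.

```python
from statistics import NormalDist


def proportion_below_threshold(mean_e, sd_e, threshold):
    """Proportion of Ss whose E falls below the response threshold L."""
    return NormalDist(mu=mean_e, sigma=sd_e).cdf(threshold)


if __name__ == "__main__":
    # Hypothetical group means of E: lower for the weak puff.
    weak = proportion_below_threshold(mean_e=0.30, sd_e=0.10, threshold=0.30)
    strong = proportion_below_threshold(mean_e=0.45, sd_e=0.10, threshold=0.30)
    print(f"weak puff: {weak:.2f}, strong puff: {strong:.2f}")
```

With the mean of E placed at the threshold, half of the weak-puff Ss would fail to respond, while a modest upward shift in mean E leaves only a small fraction below threshold. The same mechanism therefore yields both more nonconditioners and lower performance under the weaker UCS.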

Estimates based on Burstein’s published 
curves indicate that approximately half of 
the Ss (13 ?) trained with the weaker UCS 
did not condition, whereas a much lesser 
number (the authors estimated 4) failed to 

Fig. 3. Performance curves for a factorial UCS
intensity-shift experiment with and without the data
of “nonconditioners.” (From Spence, 1953.)


meet the conditioning criterion in the strong- 
puff group. Comparison of the performances 
of the remaining Ss is highly confounded by 
this difference in conditionability. 


Factorial Design Experiment: Spence, 1953 


Contrasting sharply with the findings of 
Burstein are the results of the early study 
of Spence (1953) which was of the same 
factorial design. This experiment employed a 
more optimal UCS interval (755 msec.) and 
the puff intensities (.25 and 5.00 psi) were 
such as would theoretically produce a much 
greater difference in drive strength and hence 
performance. The upper graph in Figure 3 
gives the performance curves for all Ss, The 
differences between the groups that had the 
strong and weak puffs on the preshift (Day 


1) and postshift trials (Day 2) were highly
significant (p < .001). Figure 3 also presents the
performance curves for the groups after applying
Burstein’s procedure of eliminating the data
of Ss who made less than 10% CRs in the
preshift period. The differences were still
highly significant: that for the preshift period
at about the .01 level, while that for the
postshift period remained at the .001 level.
Under these more optimal conditions for producing
a difference, even Burstein’s procedure
of eliminating more poor conditioners from
the weak- than the strong-puff group was
unable to offset the superior performance of
the Ss that had the strong puff. As we have
seen, essentially the same finding was obtained
in all but one of the studies in Tables
1 and 2.


Experiment on the Effects of Reducing UCS
Intensity: Trapold and Spence, 1960


In concluding this discussion, attention
should be directed to a finding already
in the literature that should have precluded
Burstein’s interpretation. In a study
by Trapold and Spence (1960), 40 Ss were
given 90 conditioning trials with a 2.0 psi
UCS, interspersed with an equal number of
trials on which a .33 psi puff was presented
alone, that is, without the CS. Following this
training, half of the Ss (Group S-S) were
given 40 additional conditioning trials (80
puff presentations) on which the intensities
of the paired and unpaired puffs were interchanged.
The remainder of the Ss (Group
S-W) received an equal number of additional
trials, but both the paired and unpaired puffs
were .33 psi.

Particularly relevant to the issue at hand
are the findings with regard to Group S-W
in which, it will be noted, UCS intensity was
reduced. According to Burstein, this reduction
should have had no effect on the performance
level of this group. On the other hand, our
theory would predict a decrease, for the reduction
in UCS intensity would lead to a
lowering of D, and hence E.

The results were very clear cut: Group
S-W underwent a very marked and highly
significant (p < .01) decrease in performance
level.* This result, moreover, cannot be
ascribed in any way to nonconditioners. Such 
Ss, if present, would have been responding 
at a very low level prior to the shift and thus 
could not have been responsible for the de- 
crease from 55% to 27% CRs in performance 
level. This finding is not only opposed to the 
interpretation of Burstein, but also clearly 
shows that variation of the intensity of the 
UCS does determine the level of performance 
of Ss who have conditioned. 

While not so relevant to the present issue, 
the results for Group S-S are of importance 
to our theory. Following the shift, the per- 
formance level of this group did not decrease 
significantly. This finding is in line with the 
lack of change in drive level (D) of this 
group. It also indicates that a decrease in the 
paired (reinforcing) UCS intensity in the 
postshift period did not diminish the existent 
habit strength (H)—a finding which con- 
firms an earlier assumption of Hull (1943). 


CONCLUSIONS 


The evidence, both old and new, presented 
in this report overwhelmingly refutes the 
claim of Burstein that UCS intensity does not 
affect performance level in human eyelid con- 
ditioning, but determines only whether or 
not an S conditions. Rather, the data suggest
that UCS intensity affects performance both 
through determining the learning or condi- 
tioning factor (H), and a motivation or 
drive factor (D). The findings also show that 
a weak UCS intensity, when combined with 
conditions unfavorable to learning, that is, 
growth of H, will lead to failure to condition. 
But among Ss who do condition, performance 


level is a function of the intensity of UCS


employed. 


* After 20 trials Group S-W dropped to a level of
performance that was somewhat lower even than
that attained by a third control group conditioned 
with the weak UCS. This latter decrement presum- 
ably reflects in part an adaptation or contrast effect 
of being shifted from a strong to a weak puff and 
is in part due to generalization. 
AUTHORITARIANISM SCALES AND RESPONSE BIAS 1

Scales of authoritarianism differ from other self-report measures like the MMPI
in ways that make them particularly susceptible to agreement response bias: 
(a) depending particularly on item content in relation to a theoretical syn- 
drome; (b) using ambiguous items; (c) scoring all items in 1 direction. Fit- 
tingly, evidence now indicates that response bias is a major factor on authori- 
tarian scales and not on the MMPI. This evidence can be maintained against 
the counterproposals of Rokeach and Samelson. Support is reiterated for in- 
terpreting (a) response bias not mechanically but as a response tendency when 
the subject is uncertain; (b) high authoritarianism scale scores as representing 
simple-mindedness more than authoritarian ideologies. The latter interpretation 
is supported not only for college students but even more from survey data for
the general population.


Scales derived from The Authoritarian
Personality 2 (Adorno, Frenkel-Brunswik,
Levinson, & Sanford, 1950) are different from other
self-report measures of personality and atti- 
tudes. The differences make the authoritarian- 
ism scales especially susceptible to represent- 
ing agreement response tendencies rather 


1 This investigation was supported in part by
Public Health Service Research Grant No. MH- 
06094 from the National Institutes of Mental Health. 
I am indebted to L. R. Goldberg, L. G. Rorer and 
J. Block for discussions of the issues, and to L. R. 
Goldberg for criticism of the manuscript. 

2 The specific scales dealt with in earlier research 
(Peabody, 1961), and discussed here include the F 
scale of authoritarianism (Adorno et al., 1950), the 
Dogmatism scale of Rokeach (e.g., 1956) and the
Anti-Semitism scale (Adorno et al., 1950). For con- 
venience, the term “authoritarianism” will frequently 
be extended to describe all three scales. Numerous 
other scales of the same type were inspired by The 
Authoritarian Personality. The authors themselves 
contributed further scales of Ethnocentrism, Reli- 
gious Conservatism, and Traditional Family Ideol- 
ogy. An interesting later descendent is the Conserva- 
tism scale of McClosky (1958), who worded state- 
ments to express abstract principles of conservative 
ideology. 

An exception is the Politico-Economic Conservatism
(PEC) scale, developed in The Authoritarian
Personality but not of the typical format since it 
included throughout a substantial proportion of items 
scored in the reverse direction. Accordingly, discussion
below of scales of the authoritarianism type
does not include the PEC scale, which presumably 
measures primarily some kind of content. (Note, 
however, the doubts raised by Hyman and Sheats- 
ley, 1954, pp. 73-74, as to whether what is meas- 
By politico-economic conservatism in the usual

than authoritarian content. Such tendencies
were termed “response sets” by Cronbach 
(1946, 1950); but more recently Rorer (1965) 
proposes calling them “response styles,” and 
using “response set” for other tendencies such 
as those involving faking and social desir- 
ability. To avoid terminological controversy, 
this paper will use the term “agreement re- 
sponse bias.” 

The present paper will argue that: 

1. In comparison to other self-report meas- 
ures like the MMPI, it is to be expected on 
general grounds that the authoritarianism 
scales should be particularly susceptible to 
response bias, since: (a) they are particularly 
dependent on a complex relation of item con- 
tent to a theoretical syndrome; (b) they de- 
liberately use ambiguous items, making agree- 
ment bias likely on the separate items; (c) 
they score all items in one direction, permit- 
ting agreement bias to summate systemati- 
cally over the scale as a whole. 

2. The best available evidence indicates 
that agreement bias is not a major factor in 
the MMPI, but is a major factor on the 
authoritarianism scales. This conclusion may 
be sustained despite the counterproposals of 
Rokeach (1963) and Samelson (1964). 

The distinctive features of the authoritar- 
ianism scales need to be reemphasized at this 
time, since there has been a tendency to 
treat these scales as comparable to other 
measures like the MMPI and to consign them 
all to a common fate, for worse or for better. 
In the late 1950s, evidence for response bias
in the authoritarianism scales undoubtedly
encouraged the hypothesis that it might also 
play a major role in the MMPI (e.g., Jackson 
& Messick, 1958). More recently, with the 
appearance of strong evidence that response 
bias is not a major factor in the MMPI scales, 
there is a tendency to assume that the authori- 
tarianism scales are also purged of suspicion 
(e.g., Rorer, 1965). It will be shown here 
that there is good reason to accept the evi- 
dence both that response bias is a major fac- 
tor on the authoritarianism scales and that it 
is not on scales from the MMPI.


GENERAL CONSIDERATIONS 
Dependence on Item Content 


The basis of selecting items for a typical 
MMPI scale is that they in fact distinguish 
some criterion group: an item may be se- 
lected if it distinguishes a particular group 
(e.g., diagnosed as having paranoia), regard- 
less of whether or not its content has any 
apparent relation to the characteristics of the 
pathological syndrome. In contrast, the au- 
thoritarianism items were not selected because 
they somehow or other distinguished right- 
wing criterion groups. Instead, these scales 
are heavily dependent on item content: 

1. Traditional Likert scale technique uses 
some empirical test for internal consistency, 
but has no standard method of specifying 
what the scale measures—using neither cri- 
terion groups nor Thurstone-type judges. In 
effect, it is assumed that there is a direct cor- 
respondence between the content of the items 
and the subject’s attitudes or personality, 
with the overall content representing some 
obvious dimension. 

2. Unlike most Likert scales, however, the 
content of a typical authoritarianism scale is 
not supposed to represent a single dimension, 
but the different dimensions of extensive and 
often very heterogeneous theoretical syn- 
dromes. For example, the F scale is supposed 
to represent an authoritarianism syndrome 
combining the nine theoretical characteristics 
(further defined in The Authoritarian Per- 
sonality) of Conventionalism, Authoritarian 
submission, Authoritarian aggression,
Anti-intraception, Superstition and stereotypy,
Power and “toughness,” Destruction and
cynicism, Projectivity, and Sex. This syndrome
is supported by the internal consistency
of the scale only if the content of the items
relates to these different theoretical
characteristics. One consequence is that the
syndrome is destroyed by the frequent proposal
that evidence for agreement bias in
authoritarianism scales may itself be interpreted as
an authoritarian characteristic. Although this
proposal is basically circular, its acceptance 
would still mean that the scale no longer 
measures the various different characteristics 
of the syndrome, but only the single charac- 
teristic of some kind of authoritarian submis- 
sion to “authoritative” statements. 

3. Although the relation of item content to 
the several theoretical characteristics is cru- 
cial, the effort—in contrast to traditional 
Likert scales—was not to make the relation as 
direct and straightforward as possible. On 
the contrary: “A second rule of item formu- 
lation was that each item should achieve a 
proper balance between irrationality and ob- 
jective truth. Each item had to have some 
degree of rational appeal, but it had to be 
formulated in such a way that the rational 
aspect was not the major factor making for 
agreement or disagreement. This in many 
cases was a highly subtle matter [Adorno et 
al., 1950, p. 241].” Thus, there is assumed 
only a partial correspondence between the 
content of an item and some theoretical char- 
acteristic of authoritarianism (the “irrational” 
aspect of the item), together with faith that 
subjects are responding on the basis of this 
partial aspect, although presumably without 
intending to. 

A clear example of dependence on the rela- 
tion of item content to a theoretical syndrome 
is given by the Conservatism scale of Mc- 
Closky (1958). Forty-three (later reduced to 
12) statements were worded to express seven 
abstract principles of conservative ideology, 
with agreement always scored as conservative. 
The content of the scale was independently 
judged by advanced graduate students in 
political theory as representing a conserva- 
tive ideology. However, when applied to the 
general public, the scale did not relate to 
criteria that would usually be considered con- 
servative. The correlations between the scale 
and such factors as party affiliation, support
of candidates, attitudes on economic issues,
stands on public issues, liberal-conservative
self-designation “tend to be fairly low [p. 44].”


Ambiguity

“Above all . . . each statement must avoid
any kind of ambiguity . . . double barreled
statements are most confusing and should
always be broken in two [Likert, 1932, p.
45].”

“To cover a great variety of ideas as effi- 
ciently as possible, two or more of them were 
combined in the same statement [Adorno et 
al., 1950, p. 251].” 

Although any single item necessarily con- 
founds content and response bias, response 
bias is likely to be important only to the ex- 
tent that the item is ambiguous, so that the 
subject is uncertain as to how to respond on 
the basis of content (Cronbach, 1946; 1950). 
Constructors of self-report measures like the 
MMPI normally try to follow the policy rec- 
ommended by Likert and make the items as 
unambiguous as possible. An opposite policy 
was followed in The Authoritarian Person- 
ality: each item was intended to have both 
rational and irrational aspects, and to com- 
bine several different ideas. This policy was 
successful in producing ambiguous items, as
is generally recognized by those who have ex- 
amined them closely. A possible exception 
might be Rokeach (1963, p. 307) who de- 
mands evidence that authoritarianism items 
are relatively ambiguous. 

The differences are reflected even in simple 
measures like sentence length, which are presented
in Table 1. Since distributions of sentence
length tend to be skewed, the ordinal
measures are probably preferable. By any
relevant test, the authoritarianism items are 
much longer than those of the MMPI. 

It should be noted that the Anti-Semitism
items resemble the other authoritarianism
items, despite the widespread belief that since
prejudice can be a relatively unambiguous
topic, the prejudice scales of The Authoritarian
Personality must be unambiguous. The
specific content of the statements is highly
ambiguous, as suggested both by the measures
of Table 1 and an examination of the items,
which discloses a high proportion of multi-barreled
statements. As an indication that


TABLE 1

NUMBER OF WORDS PER SENTENCE
IN DIFFERENT MEASURES

                                     Authoritarianism scales
                 MMPI           F Scale        Dogmatism      Anti-Semitism
Mean             11.2           17.3           18.2           20.9
S. D.             5.2            5.5            5.5            6.7
Median           10.2           16.0           17.5           21.0
Quartile range   7.1 to 14.1    13.0 to 20.8   14.0 to 22.5   17.0 to 23.5

prejudice items need not take such an elab- 
orate format, one can compare the classic 
Bogardus Social Distance Scale, with a mean 
and median sentence length of seven words— 
one-third that of the Anti-Semitism items. 
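
The kind of summary reported in Table 1 can be sketched in a few lines of Python. This is only an illustration: the function name, the whitespace-based word count, and the use of the sample standard deviation are assumptions, and the two items shown are invented rather than taken from any of the scales.

```python
import statistics

def sentence_length_summary(items):
    """Summarize words per item: mean, SD, median, and quartile range.

    Words are counted by splitting on whitespace (an assumption; the
    original counting rule is not stated in the text).
    """
    lengths = [len(item.split()) for item in items]
    q1, _, q3 = statistics.quantiles(lengths, n=4)
    return {
        "mean": statistics.mean(lengths),
        "sd": statistics.stdev(lengths),  # sample SD, assumed
        "median": statistics.median(lengths),
        "quartile_range": (q1, q3),
    }

# Invented items, for illustration only.
example_items = [
    "I enjoy reading newspapers",
    "People who break rules should always expect that others will judge them harshly",
]
summary = sentence_length_summary(example_items)
print(summary)
```

Because sentence-length distributions tend to be skewed, the median and quartile range from such a summary are the figures to compare across measures.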


Imbalance of Item Keying 


While ambiguity makes response bias likely 
for individual items, on a balanced scale— 
with half the items scored in each direction— 
such tendencies would be likely to cancel 
themselves out. To the degree that the ma- 
jority of items are scored in one direction, 
confounding between content and response 
bias is extended to the scale as a whole. With 
a completely imbalanced scale, response bias 
can systematically summate and lend a spuri- 
ous consistency within and between entire 
scales. Likert (1932, p. 46) recommended 
balanced scales, but the MMPI scales show 
various degrees of relative imbalance. Indeed, 
the main evidence for response bias in the 
MMPI has been based on correlations with 
the degree of imbalance on different scales. 
However, the confounding on the MMPI is 
only partial: the proportion of items scored 
in the “minority” direction remains substan- 
tial for most MMPI scales (e.g., 22% to 48% 
for the eight clinical scales). The presence of 
this minority suggests that the major factor 
in these scales may be content rather than 
response bias. 

In contrast, the authoritarianism scales in 
their final forms retain no such minority 
items. They represent an absolute, rather than 
a relative, imbalance of keying, and therefore 
an absolute confounding of authoritarian content
with agreement responses. Thus, these
scales depart from normal principles in precisely
the combination of features that make
response bias likely to be a major factor: 
ambiguous statements and complete imbalance 
of item keying. 
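
The arithmetic behind this argument can be made concrete with a minimal sketch. It assumes a hypothetical subject who is uncertain about content and simply agrees with every item; the 20-item scale length and the function name are illustrative choices, not details from any of the scales discussed.

```python
def scale_score(responses, keys):
    """Count the responses that match the keyed (scored) direction."""
    return sum(1 for response, key in zip(responses, keys) if response == key)

n_items = 20
# Hypothetical subject who agrees with every item regardless of content.
yea_sayer = ["agree"] * n_items

# Balanced scale: half the items keyed in each direction.
balanced_keys = ["agree"] * 10 + ["disagree"] * 10
# Completely imbalanced scale: every item keyed "agree".
one_direction_keys = ["agree"] * n_items

balanced_score = scale_score(yea_sayer, balanced_keys)
one_direction_score = scale_score(yea_sayer, one_direction_keys)
print(balanced_score, one_direction_score)
```

On the balanced scale the agreements cancel and the subject lands at the midpoint (10 of 20); on the one-direction scale the same behavior summates to the maximum score (20 of 20).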


SPECIFIC EVIDENCE


The bulk of studies of response bias have 
considered evidence from correlations (in- 
cluding their factor analysis) between a scale 
and some other measure. In general, such
evidence is not decisive since it does not re- 
move the confounding (partial or complete) 
between content and response bias for the 
scale itself. However, there are now at least 
two types of evidence for the MMPI that 
seem to meet this problem and indicate that 
response bias is relatively unimportant. This 
evidence will be considered and compared with 
the situation for the authoritarianism scales. 

Block (in press) made use of the fact that 
MMPI scales have only a relative imbalance 
of item keying. He randomly discarded items 
with the majority keying until he had “bal- 
anced” scales with the same number of items 
scored in each direction. He showed that these 
balanced scales correlated as highly as could 
be expected (given their shorter length) with 
the regular unbalanced scales, and that the 
balanced scales had essentially the same fac- 
tor structure as the regular scales. These find- 
ings give strong evidence that response bias 
does not play a major role in the MMPI, since 
items keyed in opposite directions seem to be 
measuring the same thing. Block suggests that 
the partial imbalance of the regular scales was 
an inadvertent and irrelevant result of their 
construction. 
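
The balancing step described above can be sketched as follows. This is a hedged illustration of the general idea (random discard of majority-keyed items until the two keying directions are equal), not a reproduction of Block's actual procedure; the function name, the "T"/"F" key coding, the seed, and the example item numbers are all assumptions.

```python
import random

def balance_scale(keys, seed=0):
    """Randomly discard majority-keyed items until both directions are equal.

    keys: dict mapping item id -> "T" or "F" (the keyed direction).
    Returns a dict holding the retained items and their keys.
    """
    rng = random.Random(seed)
    true_items = [item for item, key in keys.items() if key == "T"]
    false_items = [item for item, key in keys.items() if key == "F"]
    n_keep = min(len(true_items), len(false_items))
    kept = rng.sample(true_items, n_keep) + rng.sample(false_items, n_keep)
    return {item: keys[item] for item in sorted(kept)}

# Hypothetical 10-item scale with 7 items keyed "T" and 3 keyed "F".
example_keys = {i: ("T" if i <= 7 else "F") for i in range(1, 11)}
balanced = balance_scale(example_keys)
print(balanced)
```

The balanced and regular versions of each scale would then be scored on the same subjects and correlated, allowing for the shorter length of the balanced version.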

In contrast, the authoritarianism scales in 
their final forms are untestable by such a tech- 
nique since they retain no “minority” items 
about which a balanced scale might be formed. 
At the same time, there is also evidence from 
the development of these scales that such a 
technique would fail to show an equivalence 
between items keyed in either direction. Al- 
though some rationale was initially presented 
for complete imbalance in the case of the 
Anti-Semitism scale (e.g., Adorno et al., 1950,
p. 59), the arguments were not so compelling
as to prevent the constructors of the F,
Ethnocentrism, and Dogmatism scales from trying
to write “minority” items where agreement
would be scored as nonauthoritarian. None of
these items survived into the final forms of
the scales, since in all cases they failed to
measure the same thing as the other items
where agreement was scored as authoritarian.

The first form of the F scale used 38 items,
for three of which agreement was scored as
nonauthoritarian. In the item analysis using
the “Discriminatory Potential” of each item


the majority items by the Mann-Whitney test
despite the tiny sample of “minority” items.)
The second form of the Ethnocentrism scale
introduced a “first and only” Ethnocentrism
item scored in the minority direction (Adorno
et al., 1950, p. 118). This item had a dis- 
criminatory potential close to zero, by far 
the worst of any item on the scale. Rokeach 
(1956) presents no data for the Dogmatism 
scale but states: “It is worth pointing out, 
however, that in our preliminary research we 
used dogmatism items worded in the opposite 
direction—that is, disagreement indicating 
highness. Such items had to be discarded be- 
cause they consistently failed to discriminate 
low from high dogmatic groups [p. 40].” Thus,
on all these scales, the attempts to construct 
items that would measure authoritarianism 
and not agreement were abandoned when these 
items failed to measure the same thing as the 
usual items confounding agreement and au- 
thoritarianism. In contrast to the MMPI, 
where Block argues that the relative imbal- 
ance is an inadvertent and irrelevant by- 
product, the absolute imbalance of the author- 
itarianism scales seems neither inadvertent 
nor irrelevant. 

A second technique for unconfounding con- 
tent and response bias involves trying to 
write reversals of the original statements. 

Rorer (1963) tried to write logical contra- 
dictories for all 550 statements of the MMPI. 
The original and reversed versions were given 
2 weeks apart and responses compared to the 
two versions of the same item. Subjects generally
gave opposite responses (“true” and
“false”) to the two versions, consistent with 
an interpretation in terms of content. Since 
this consistency was about that of repeating
the original MMPI itself (Rorer & Goldberg,
1965), response bias again seems to be an 
unimportant factor in the MMPI. 

To reverse the MMPI, it is possible simply 
to take the content of an original as given, 
and—whatever this content may be—try to 
construct a logical reversal, such that a per- 
son logically should endorse one version and 
reject the other. Such a purely logical ap- 
proach is not sufficient for the authoritarian- 
ism scales. Evidently, a reversal should re- 
verse whatever authoritarian characteristics 
the original was supposed to represent. How- 
ever, as shown earlier, the authoritarian char- 
acteristics are represented by only part of the 
item content (the “irrational” aspect). Ac- 
cordingly, as Christie, Havel, and Seidenberg 
(1958) pointed out, a strictly logical reversal 
of the original statement may not reverse the 
authoritarian aspect. An example is provided 
by an early reversal used by Bass (1955). 
The original F scale item was: “The wild sex 
life of the old Greeks and Romans was tame 
compared to some of the goings-on in this 
country, even in places where people might 
least expect it.” This original is supposed to 
represent the authoritarian characteristics of 
Projectivity (“The disposition to believe that 
wild and dangerous things go on in the world; 
the projection outward of unconscious emotional
impulses”) and Sex (“Exaggerated concern
with sexual ‘goings-on’ ”). Bass’ reversal
was rated by his student judges as highly op- 
posite in meaning, but does not seem to re- 
verse these characteristics: “Some of the 
goings-on in this country, even in places 
where people might least expect it, are tame 
compared to the wild sex life of the Greeks 
and Romans.” 

On the other hand, a reversal that gives 
priority to reversing the authoritarian aspects 
of the original is open to the criticism made 
by Rorer (1965) and Samelson (1964) that a 
subject might logically agree (or disagree) 
with both versions. Such purely logical criticism
does not do justice to the distinctive
features of authoritarianism scales, but the 
dilemma is real and cannot be escaped en- 
tirely. Hence, the nature of authoritarianism 
scales makes it impossible to have perfect re- 
versals both of authoritarianism and of the 
actual item content. The best that can be done 
is to assign some priority—presumably to 
reversing authoritarianism. For example, reversals
should give priority to insuring that
agreement with both versions cannot repre- 
sent a consistent authoritarian position, al- 
though disagreement with both versions might 
then logically represent a consistent non- 
authoritarian position. 


TABLE 2

COMPARISON OF RESPONSES TO ORIGINALS AND REVERSALS: PROPORTIONS
FOR DIFFERENT RESPONSE COMBINATIONS

                                                MMPI a       Authoritarianism scales b
Response combinations c                         (N = 221)    (N = 163)

Original endorsed (“True” or “Agree”)
  (1) “Acquiescence”: TT or AA                  .08
  (2) Content-consistent: TF or AD              .32
Original rejected (“False” or “Disagree”)
  (3) Content-consistent: FT or DA              .51
  (4) “Negativism”: FF or DD                    .09
                                                1.00
Relative content-consistency
  Original endorsed: (2)/[(1) + (2)]            80%
  Original rejected: (3)/[(3) + (4)]            85%

a Rorer and Goldberg (1965), male and female samples combined.
b Peabody (1961), American and English samples combined.
c Abbreviated: T = True, F = False, A = Agree, D = Disagree, with responses to the original given first.

Peabody (1961) tried to write reversals
along these lines for items from three authori- 
tarianism scales. The design was otherwise 
the same as that used by Rorer for the 
MMPI, and it is instructive to compare the 
two sets of results, presented in Table 2. 

Table 2 lists the four response combina- 
tions, and presents their proportions for each 
group of items. These results are also pre- 
sented as relative percentages: considering 
only cases where the original is endorsed or 
where it is rejected, what is the percentage of 
content-consistent responses on the reversal? 
These relative figures tend to have greater 
comparability for different items and samples. 
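
The relative percentages are simple ratios of the four proportions. The sketch below works them out for the MMPI proportions of Table 2 (.08, .32, .51, and .09); the function and variable names are illustrative assumptions.

```python
def relative_content_consistency(acquiescence, consistent_endorsed,
                                 consistent_rejected, negativism):
    """Return content-consistency among endorsed and among rejected originals.

    Arguments are the proportions of the four response combinations:
    (1) TT or AA, (2) TF or AD, (3) FT or DA, (4) FF or DD.
    """
    endorsed = consistent_endorsed / (acquiescence + consistent_endorsed)
    rejected = consistent_rejected / (consistent_rejected + negativism)
    return endorsed, rejected

# MMPI proportions from Table 2.
mmpi_endorsed, mmpi_rejected = relative_content_consistency(0.08, 0.32, 0.51, 0.09)
print(round(mmpi_endorsed, 2), round(mmpi_rejected, 2))
```

These ratios correspond to the 80% and 85% figures given in the table for the MMPI.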

For the MMPI, the results suggest that 
content is the major factor; whether the orig- 
inal is endorsed or rejected, subjects tended 
(80% or more of the time) to give the opposite
response to the reversal. For the authoritarianism
scales, it is only among disagreements
with the originals that the results are
similarly consistent with (anti-authoritarian) 
content. In contrast, subjects agreeing with 
the originals tend to agree also with the re- 
versal approximately two-thirds of the time, a 
result inconsistent with pro-authoritarian con- 
tent and suggesting agreement response bias. 
This asymmetry in the role of response bias is 
considered disturbing by Rokeach (1963) and 
Samelson (1964), but as was pointed out in 
the earlier article (Peabody, 1961), it is in 
accord with independent evidence that for 
most people response bias is toward “acqui- 
escence” (endorsement rather than rejection). 

The original authors of the scales would 
not have claimed that each item was a perfect 
measure of authoritarianism, but only that 
agreement would generally tend to reflect au- 
thoritarianism. By the same token, it need 
not be claimed that each reversal is a perfect 
one, but only that they generally reverse the 
authoritarianism of the original; the results 
of Peabody (1961) then show that agreement 
with originals generally reflects agreement response
bias rather than authoritarian content.
This conclusion is also in accord with the 
distinctive features of authoritarianism scales, 
considered earlier, that make them especially 
susceptible to response bias. 

In an effort to avoid this conclusion, Rokeach
(1963) and Samelson (1964) have proposed
reinterpretations to explain away the
high frequency of double agreement with au- 
thoritarianism items. It remains to consider 
whether these proposals present any reason to 
change the conclusion that agreement with 
originals generally does not represent authori- 
tarian attitudes. 


Rokeach’s Counterproposals 


Rokeach (1963) proposes two hypotheses 
to account for the large amount of double 
agreement, as alternatives to interpreting it as 
response bias (which he calls “Hypothesis 
A”). His method of argument does not re- 
strict itself to data, but makes extensive use 
of hypothetical anecdotes from everyday life. 
Most of these anecdotes deal with ethnic 
prejudice, where his hypotheses appear rela- 
tively plausible, but leave the more general F 
and Dogmatism scales largely unprotected. 

Rokeach’s first hypothesis (“Hypothesis
B1”) is that subjects may be telling the truth
in agreeing with the original and deliberately 
lying in agreeing with the reversal. In the 
absence of further evidence, it is hard to find 
the justification for interpreting various agree- 
ments in opposite ways depending on which 
will preserve the validity of authoritarianism 
scales. As it happens, there is already some 
evidence against this hypothesis: Stanley and 
Martin (1964) have shown that independent 
indices of lying do not relate significantly to 
agreement with Dogmatism reversals. 

Rokeach’s second hypothesis (“Hypothesis 
B2”) is that subjects may be telling the truth 
in agreeing with two reversed statements since 
they may in fact have contradictory opinions. 
This hypothesis may be treated most effi- 
ciently by considering that its acceptance 
would not really preserve the relevant authori- 
tarian syndrome. Instead, the effect is similar 
to that pointed out earlier for the interpreta- 
tion of agreement bias as itself an authori- 
tarian characteristic; it would be to collapse 
the syndrome into a single specific quality. 
For example, the theoretical outline of Dog- 
matism, in Rokeach’s (1956) version, in- 
volved three major subdivisions, separated 
into 12 second-level subdivisions, separated 
further into a total of 24 final subdivisions. 
(In Rokeach’s 1960 version the outline is substantially
reshuffled, but still seems to involve
about 24 final subdivisions.) One of these 24
seems to represent what the entire scale is
measuring according to Hypothesis B2: “The
coexistence of contradictions within the belief 
system.” Consider an item supposed to meas- 
ure another theoretical characteristic: “In the 
long run the best way to live is to pick friends 
and associates whose tastes and beliefs are the 
same as one’s own.” Agreement with this item 
is supposed to reflect “Narrowing, referring to 
selective avoidance of contact with facts, 
events, etc., incongruent with the belief-dis- 
belief system.” However, 94% of those agree- 
ing with the original item also agree with the 
reversal: “In the long run, rather than have 
only friends and associates whose tastes and 
beliefs are the same as one’s own, it’s better 
to include some friends and associates with 
different tastes and beliefs.” It appears that 
agreement with the original did not generally 
reflect a selective narrowing. If Rokeach’s
second hypothesis is taken seriously, it reflects 
instead the coexistence of contradictory be- 
liefs, and “Narrowing” disappears from the 
theoretical syndrome which the Dogmatism 
scale was supposed to measure. Indeed, if the 
hypothesis is applied generally, the scale no 
longer measures the initial elaborate syn- 
drome, but largely the single characteristic of 
the coexistence of contradictory beliefs. This 
is a possible hypothesis, but it seems unlikely 
that Rokeach intended to give up the Dogma- 
tism syndrome in this way. 

Rokeach (1963, p. 307) invokes, in addition,
a number of detailed criticisms, some of
which deserve brief comment: 

1. Rokeach objects to “. . . Peabody’s claim, 
which he makes without providing the slight- 
est independent evidence, that response set is 
a function of ambiguity of items.” The empirical
and logical grounds for considering
response set to be a function of ambiguity of
items have been reviewed by Cronbach (1946;
1950), as cited in the earlier paper (Peabody,
1961).

2. Rokeach demands demonstration that:
(a) items may be differentially ambiguous for
some subjects and not others, and (b) that
those for whom the items are ambiguous
typically agree. It is true that Point a was
assumed; one might make the opposite assumption
that there are no individual differences
and any item is equally ambiguous for
all subjects. As regards Point b, Cronbach
cites evidence that most subjects tend to endorse
(“Acquiescence”) rather than reject
(“Negativism”) when the situation is ambigu- 
ous; three references giving the mean per- 
centage of such “acquiescence” as 62-69% 
for typical college student samples were cited 
in the earlier report (Peabody, 1961, p. 2). It 
is precisely this asymmetry that makes re- 
sponse set more likely to contaminate the high 
scores.

3. Rokeach is disturbed by the lower vari- 
ance and reliability coefficients of reversed 
scales as compared with original scales, a 
finding in practically all reversal studies. 
However, from the asymmetry in response set 
just mentioned, it follows that agreements 
with the original are more likely to represent 
response bias, and disagreements a consistent 
anti-authoritarian content. In both cases, 
agreement with the reversal is likely, result- 
ing in a lower variance. The lower reliability 
coefficients of the reversals in most studies 
can be accounted for by the reduced variance, 
according to the well-known relation between 
the variance and the reliability coefficient 
(Gulliksen, 1950, Equation 5, p. 111). 
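
A small numerical sketch shows why reduced variance alone can lower a reliability coefficient. It assumes the standard classical-test-theory relation in which error variance stays constant while observed variance changes; whether this is exactly the form of the equation cited from Gulliksen is not confirmed here, and the function name and numbers are hypothetical.

```python
def reliability_after_variance_change(r_old, var_old, var_new):
    """Reliability expected when observed variance changes but error variance does not.

    Assumes error variance = var_old * (1 - r_old) stays fixed, so that
    r_new = 1 - error variance / var_new.
    """
    error_variance = var_old * (1.0 - r_old)
    return 1.0 - error_variance / var_new

# Hypothetical figures: reliability .80 at variance 100, then variance halved.
reduced = reliability_after_variance_change(0.80, 100.0, 50.0)
print(round(reduced, 2))
```

With these invented figures, halving the variance lowers the expected reliability from .80 to .60 with no change in the amount of error.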


Samelson’s Counterproposal 


The reinterpretation proposed by Samelson 
(1964) is along more conventional psycho- 
metric lines. Samelson proposed that the re- 
versals of Peabody (1961), while generally 
reversing the direction of the original, may 
have displaced the location of the “neutral 
point” (where agreement changes to disagree- 
ment) on the hypothetical underlying content 
continuum. Depending on which way the lo- 
cation of the reversal was shifted, either dou- 
ble agreement or double disagreement might 
not represent response bias but a consistent 
content position. 
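
Samelson's proposal can be illustrated with a simple threshold model. This sketch is an assumption made for illustration, not a model taken from Samelson (1964): each subject has a position on a content continuum (higher meaning more authoritarian), agrees with the original when above its neutral point, and agrees with the reversal when below the reversal's neutral point.

```python
def response_pattern(position, original_neutral, reversal_neutral):
    """Classify a subject's responses to an original item and its reversal.

    position: subject's location on the content continuum (higher = more
    authoritarian). The neutral points are where agreement changes to
    disagreement for each version. All values are hypothetical.
    """
    agrees_original = position > original_neutral
    agrees_reversal = position < reversal_neutral
    if agrees_original and agrees_reversal:
        return "double agreement"
    if not agrees_original and not agrees_reversal:
        return "double disagreement"
    return "content-consistent"

# Reversal too moderate: its neutral point lies above that of the original.
print(response_pattern(0.6, original_neutral=0.5, reversal_neutral=0.7))
# Reversal too extreme: its neutral point lies below that of the original.
print(response_pattern(0.6, original_neutral=0.7, reversal_neutral=0.5))
```

In this model a subject lying between the two neutral points gives a double agreement or a double disagreement on grounds of content alone, which is the possibility at issue.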

The problem is complicated by the fact 
mentioned earlier that generally there is only 
a partial correspondence between item con- 
tent and authoritarianism. It was expressed 
in the more crucial terms of authoritarianism 
in the earlier report (Peabody, 1961, p. 3): 
Ideally, a reversal should be such that (a) a
person with a pro-authoritarian attitude should
disagree, and not be included among the
double agreements; (b) a person with a nonauthoritarian
attitude should agree, and not
be included among the double disagreements. 

The less crucial question of double dis- 
agreement may be considered first. To the 
extent that the reversal states an extreme anti-authoritarian
position (i.e., is an opposite
rather than a reversal), it is perfectly con- 
sistent for a subject with a moderate position 
to disagree with both statements. This un- 
doubtedly occurred frequently in early re- 
versal studies where extreme reversals were 
common—e.g., for the original F scale item 
“The businessman and the manufacturer are 
much more important to society than the 
artist and the professor” an early reversal was 
“The artist and the professor are much more 
important to society than the businessman and 
the manufacturer [Bass, 1955; Leavitt, Hax, 
& Roche, 1955].” Rorer (1963) has pointed 
out that such double disagreements would 
misleadingly lower the apparent content-consistency
(Peabody, 1961, p. 5), and so conceal the
degree of asymmetry (“acquiescence”) in
response bias.

It should be emphasized that this possibility 
of extreme reversals permitting consistent 
double disagreement is not the important issue 
raised by Samelson regarding the study of 
Peabody (1961). Substantial double disagree- 
ment occurred on a few items (Samelson men- 
tions the one F scale item where this clearly 
occurred). However, in general, the propor- 
tion of double disagreement was relatively 
low. 

Instead of double disagreement, Samelson, 
like Rokeach, is primarily concerned with 
the high level of double agreement, which 
casts doubt on the interpretation of agreement
with the original as generally representing
authoritarian content. To try to save this
crucial interpretation, Samelson’s proposal 
must generally be that the reversals were not 
too extreme but too moderate, “overlapping” 
with the original so that double agreement
might represent a consistent authoritarian
position. On the other hand, in writing the
reversals, priority had been given to trying
to prevent just this possibility of double agreement
representing pro-authoritarian content.
Samelson’s proposal may be reasonable with- 
out being true. 

Samelson’s main argument, like Rokeach's, 
is on a priori grounds: his hypothesis would 
permit authoritarianism scales to be treated in 
terms of standard scaling theory, and is there- 
fore preferable on grounds of parsimony to a 
response bias interpretation. However, Oc- 
cam’s razor (Samelson, 1964, p. 340) cuts 
both ways; in view of the distinctive features 
of the authoritarianism scales, it can be more 
parsimonious to treat them as basically dif- 
ferent from standard scales. There are addi- 
tional facts that are also interpreted more 
parsimoniously in terms of response bias than 
of content—for example, the fact that agree- 
ment with authoritarianism items tends to 
relate positively to agreement with other items 
whether this really fits theoretically (author- 
itarianism-xenophobia) or not (authoritarian- 
ism-xenophilia). 

In addition, Samelson correctly cites a num- 
ber of specific findings from the earlier study 
(Peabody, 1961) and shows that they are
compatible with his interpretation. In general,
these findings are also compatible with
an interpretation in terms of response bias, 
and do not help decide between them. Samel- 
son shows that his interpretation might be 
true; it remains to be seen whether it is likely 
in fact. 

In view of the admitted difficulties in re- 
versing authoritarianism items, one would 
have thought that Samelson would cite ex- 
amples of reversals that are shifted in such a 
way as to permit the double agreement representing
authoritarian content that he postulates.
The clearest examples should be
those items giving the largest proportions of 
double agreement, which—if Samelson’s hy- 
pothesis is generally correct—should reveal a 
massive degree of overlap, including major 
authoritarian positions that are consistent 
with double agreement. Accordingly, given the 
usual space limitation, Table 3 presents those 
seven items on the F scale where double agree- 
ment represented over half the responses in 
the combined samples.

TABLE 3
F-SCALE ITEMS TO WHICH A MAJORITY OF SUBJECTS(a) GAVE DOUBLE AGREEMENT

Item No.(c), Original (O), Theoretical characteristics, and Reversal (R),
followed by the proportion of each response combination(b) (AA, AD, DA, DD)

13. O: What the youth needs most is strict discipline, rugged determination, and the
       will to work and fight for family and country. (Authoritarian aggression + Power and
       “toughness”).
    R: There are things young people need at least as much as strict discipline, rugged
       determination, and the will to work and fight for family and country.
    AA [illegible]   AD [illegible]   DA .21   DD .02

31. O: Nowadays more and more people are prying into matters that should remain
       personal and private. (Anti-intraception + Projectivity).
    R: People are left as free as ever to keep personal and private matters to themselves;
       they might be more willing to discuss them with those who are genuinely interested.
    AA [illegible]   AD .07   DA .21   DD .05

21. O: Young people sometimes get rebellious ideas, but as they grow up they ought
       to get over them and settle down. (Authoritarian submission).
    R: Young people ought not to give up all of their rebellious ideas as they grow up.
    AA [illegible]   AD .13   DA .23   DD .02

2.  O: No weakness or difficulty can hold us back if we have enough will power.
       (Power and “toughness”).
    R: Difficulties and weaknesses often hold us back, no matter how great our will
       power.
    AA [illegible]   AD .12   DA .24   DD .02

1.  O: Obedience and respect for authority are the most important virtues children
       should learn. (Conventionalism + Authoritarian submission).
    R: There are other virtues children should learn at least as important as obedience
       and respect for authority.
    AA [illegible]   AD .01   DA .44   DD [illegible]

42. O: No sane, normal, decent person could ever think of hurting a close friend or
       relative. (Authoritarian submission).
    R: Even a normal, decent person will sometimes think of things that might hurt
       a close friend or relative.
    AA [illegible]   AD .06   DA .40   DD .00

23. O: What this country needs most, more than laws and political programs, is a few
       courageous, tireless, devoted leaders in whom the people can put their faith.
       (Authoritarian submission + Power and “toughness”).
    R: This country needs better laws and political programs at least as much as a few
       courageous, tireless, devoted leaders in whom the people put their faith.
    AA [illegible]   AD .12   DA .30   DD .07

(a) Combined samples from Peabody (1961) of 88 American and 75 English. N = 163 except
    for a few cases where subjects [...] an item.
(b) As in Table 2, A = Agree, D = Disagree, with response to the original given first.
(c) Item numbers and theoretical characteristics are from Adorno et al. (1950, pp. 255-257).

In examining these items, the question is
not whether there is perfect logical reversal
of the original, nor whether some reversals
would logically permit double disagreement,
nor even whether one can conceive of some
content position that could logically permit
double agreement. If Samelson’s hypothesis is
to serve as a general explanation, the question
is whether the reversals overlap the originals
very drastically—so drastically as to include
content positions which (a) are important
enough to permit consistent double agreement
by over half the subjects, and (b) are also
authoritarian in some specifiable way.

Certain seemingly minor changes in the reversals
(e.g., on Item 13, “youth” was changed
to “young people”) were made at the time to
try to reduce memory for the specific item
two weeks later, and might now be criticized.
For Item 31, there was not a simple reversal
but an attempt to reverse both barrels of the
original: (a) there are matters that should
remain personal and private (Anti-intraception);
(b) nowadays there is more prying
into these matters (Projectivity). As a result,
the reversal clearly permits consistent double
disagreement, and conceivably consistent
double agreement as well. Probably the best
case for Samelson’s hypothesis could be made
for Item 42, where the reversal attempted to
deal with the possible double meaning of the
word “hurting.” In this case, there clearly are
positions that logically permit double agreement,
and possibly they are authoritarian as
well.


On the other hand, two of the items (Nos. 
2 and 21) are close to logical opposites, 
permitting consistent double disagreement 
(which, in fact, rarely occurred) but hardly 
the huge obtained proportions of double agreement.
Three items (Nos. 1, 13, and 23) seem
close to logical reversals—given the inherent 
inexactness of everyday language. In sum- 
mary, for these seven items where Samelson’s 
postulated overlap should appear most power- 
fully, there are two that are questionable and 
five where double agreement is unlikely to 
represent consistent authoritarian content 
(and suggests response bias). Nevertheless, 
on these latter five items, double agreement 
with both versions actually occurred for 61% 
of all subjects. Hence Samelson’s proposal, 
although perfectly reasonable on a priori 
grounds, does not seem plausible as a general 
explanation. Those wishing to maintain Sam- 
elson’s hypothesis—and belief that authori- 
tarianism scales generally represent authori- 
tarian content—should specify the important 
authoritarian positions that consistently per- 
mit such a high level of double agreement. In 
general, the counterproposals of Rokeach and 
Samelson do not seem to require a modifica- 
tion of the conclusion from both general con- 
siderations and the specific evidence: the dis- 
tinctive features of authoritarianism scales 
are such that agreement with the items is 
generally more likely to represent response 
bias rather than authoritarian content. 
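
As an illustration only, the response-combination proportions of the kind reported in Table 3 (AA, AD, DA, DD, with the response to the original given first) could be tabulated from paired responses roughly as in the following minimal Python sketch. The function name `combination_proportions` and the sample responses are hypothetical; they are not taken from Peabody (1961) or from any of the studies discussed.

```python
from collections import Counter

# The four original/reversal response combinations, original given first.
COMBINATIONS = ("AA", "AD", "DA", "DD")


def combination_proportions(pairs):
    """Return the proportion of each original/reversal response combination.

    `pairs` is an iterable of (original, reversal) responses, each "A" (agree)
    or "D" (disagree). Subjects who omitted either version are assumed to have
    been dropped beforehand. Hypothetical helper, for illustration only.
    """
    counts = Counter(original + reversal for original, reversal in pairs)
    total = sum(counts[c] for c in COMBINATIONS)
    if total == 0:
        raise ValueError("no valid response pairs")
    return {c: counts[c] / total for c in COMBINATIONS}


# Made-up responses from eight subjects to one item and its reversal.
sample = [("A", "A")] * 5 + [("D", "A")] * 2 + [("A", "D")]
proportions = combination_proportions(sample)
print(proportions)
```

In this scheme AD and DA are the content-consistent cells (pro-authoritarian and anti-authoritarian, respectively), while AA and DD are the double agreements and double disagreements at issue in the argument above.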


IMPLICATIONS 


The problems of authoritarianism and re- 
sponse bias are complex, and it is understand- 
able that they have often been considered in 
terms that are likely to be too simple and 
categorical. There is a tendency to assume 
that response bias is automatic and mechani- 
cal, occurring on every item for every sub- 
ject, so that it can be adequately measured 
by a raw score for the number of responses in 
a given direction. There is a related tendency 
to assume that response bias must be entirely 
general, so that findings of low correlations
between raw scores on diverse instruments
are taken as disproving the importance of
response bias within an instrument.

Eventually it may be necessary to deal
with a reality that is far more complex. Cronbach
(1946; 1950) early argued that response
bias should be considered as a tendency to
respond in a particular way when the subject
is uncertain. This will not occur on every item
for every subject, but only to the extent that
a subject cannot answer an item on the basis
of content. Adequate measures of response
bias cannot use raw scores but should involve
at least some estimate of the extent to which
the subject is responding on the basis of content
(cf. Peabody, 1964). Similarly, as regards
generality, Cronbach (1946) pointed
out that if content is relatively specific, low
correlations should be expected between raw
scores on different materials, even if the response
tendency when uncertain were a consistent
probability for an individual: “Response
sets operate in proportion as a situation
is unstructured, and the student who finds
a psychology test unstructured because of his
ignorance, may be able to answer his chemistry
test on the basis of knowledge. Unless
degree of structuration could be equated for
all individuals, correlations of ‘response set
scores’ from test to test are meaningless
[p. 486].” However, even if the individual’s
tendency to respond when uncertain should
itself be relatively specific or temporary, it
might still have an important effect on a given
administration of a given instrument.

The authoritarianism scales should also be
considered complex in several ways. Despite
the similarity in ambiguity between the Anti-Semitism
items and those of other scales, it is
quite possible that the prejudice scales differ
in important respects. The ambiguity means
that Anti-Semitism scores are also unlikely to
reflect the intended syndrome of specific attitudes,
but it is at least likely to be clear that
the items generally concern prejudice, and the
subject who wants to demonstrate general
tolerance (or intolerance) can respond according
to this general attitude. In contrast,
the general topic of the F or Dogmatism
scale seems intrinsically more vague, and the
subject is less likely to recognize any overall
theme about which he might express himself.

It is quite possible that authoritarianism
scales may involve several kinds of content
as well as response bias. Many of the numerous
correlations of the F scale are of the
order of .3 or .4 and could, in view of a reliability
frequently around .7, represent a number
of different aspects of what the F scale
may measure. What is measured may vary for
different groups: although consistent pro-authoritarian
attitudes are generally uncommon,
they have been found among certain [...],
particularly among a very so[...]
elite. Thus, it need not be doubted that
there are “real” authoritarians, some of
whom may spontaneously make statements
like the scale items (cf. Samelson, 1964, p.
342). The evidence indicates, however, that
they are usually not the majority of those
getting high scores on authoritarianism scales.
The evidence is that most high scorers simply
do not have the complex set of attitudes required
for the theoretical syndrome: what
connects the varied aspects of the scale is not
their ideologies, but their repeated uncertainty
when faced with a series of ambiguous statements.
The point is put with characteristic [...]
by Adorno (Adorno et al., 1950),
who speaks of “the widespread ignorance and
confusion of our subjects in political matters,
a phenomenon that might well surpass what
even a skeptical observer should have anticipated.
If people do not know what they are
talking about, the concept of ‘opinion,’ which
is basic to any approach to ideology, loses
much of its meaning [p. 658].” This insight
was unfortunately overwhelmed by the desire
to believe in the scales as a ready means of
identifying authoritarians, so that one can
proceed with “substantive research [Rokeach,
1963, p. 309]” concerning the correlates of
the supposed authoritarianism.

In fact, the characteristics found to correlate
with the scales in previous research
(e.g., see earlier summaries in Christie, 1954,
and Christie & Cook, 1958) are more likely
to represent ignorance and confusion than
real authoritarianism. For the great bulk of
research using American college students, the
scales seem to measure primarily a contrast
between (a) low scorers with consistent anti-authoritarian
attitudes, and (b) high scorers
with tendencies to respond “agree” to complex
statements in the absence of any corresponding
attitudes. On a behavioral rating, these
low scorers are likely to continue to be non- 
authoritarian and the high scorers less so. 
Thus the scores may have validity in predicting
relative behavior. Indeed, as regards the
practical question that inspired The Authori- 
tarian Personality, the high scorers would 
presumably be less resistant to a right-wing 
movement (Peabody, 1961, p. 9). Neverthe- 
less, it makes both a theoretical and a prac- 
tical difference whether they are confused and 
apathetic or fanatical true believers. 

One major difference is in the interpreta- 
tion of the considerable evidence that high 
scorers are, in several ways, unsophisticated. 
For the orthodox interpretation, high scores 
result from a complex set of authoritarian 
attitudes, with authoritarian personality 
structure to match. These authoritarian char- 
acteristics are also the source of primitive- 
ness and simplicity of cognitive structure 
(e.g., intolerance of ambiguity; the whole 
set of structural properties developed by 
Rokeach). An interpretation in terms of re- 
sponse bias would turn this sequence upside 
down: it is primarily cognitive simplicity— 
involving a paucity of differentiated attitudes 
and a lack of verbal sophistication—that 
results in response bias and high scores on 
the scales, which may or may not be accom- 
panied by specifically authoritarian char- 
acteristics. 

It is enlightening to go beyond the college 
sophomore and consider the attitudes of the 
general public as expressed by national sam- 
ples (Campbell, Converse, Miller, & Stokes, 
1960). It is plausible (e.g., Christie et al., 
1958, p. 145) that those outside the ivied 
halls might show greater attitudinal consist- 
ency than college students. On the other 
hand, a response-bias point of view would 
expect those with less verbally articulated 
attitudes to show even more response bias 
than college students. This latter expectation
is supported by the evidence. This evidence, 
as in most studies using authoritarianism 
reversals, consists of correlations between 
originals and reversals. The more complete
results in Table 2 show that the content-consistency
of those disagreeing with authoritarian
originals is even greater than the in[...]
[...] the F scale, compared with .85 for the
MMPI.) Correlations slightly in the content
direction are also found in
reversal studies using college students (e.g.,
see the summary in Chapman & Bock, 1958).

In contrast, for a national sample using F 
scale originals and reversals from Christie, the 
overall correlation is in the direction of [...]
the direction of content-consistency (Campbell
et al., 1960, pp. 512, 515). “People who
look like ‘high authoritarians’ where agree- 
ment means an authoritarian response tend to 
look like ‘low authoritarians’ on the reversed 
items.” Other evidence that fits authoritarian 
theory can be found only for the college- 
educated subsample (Campbell et al., 1960, 
p. 514). 

Similarly, Campbell et al. (1960, pp. 210-212;
508-509) corroborate the negative findings
of McClosky (1958) for his Conservatism
scale. For a national sample, the
measure, which is “subject to a severe problem
of response set,” generally does not yield
correlations that fit theoretical assumptions.


Even more revealing is the independent 
evidence by Campbell et al. (1960, Ch. 10, 
also Ch. 9) of the poverty and primitiveness 
of attitudes in the general population. Despite 
lenient criteria, only 2.5% of the population
showed evidence of using any ideological di- 
mension, with another 9% showing evidence 
of using even a single ideological concept. 
The great bulk of the population, and even 
the majority of the college-educated, simply 
do not reveal the complex attitudes needed 
to sustain the theoretical syndromes of au- 
thoritarianism. The upper 2.5% presumably 
includes most of the social scientists who have 
been interested in the theories of authoritarianism,
and who have tended to project
the complexity of their own views onto
subjects (cf. Asch, 1952, p. $43; Campbell
et al., 1960, p. 212). The ignorant and [...]

The preceding discussion may seem academic
in the light of the 1964 United States
election, indicating a substantial support for [...]
foreseen in The Authoritarian Personality.
The practical danger remains, however, if the
preceding analysis is applied: most of the
support may be interpreted as disgruntlement,
either vague or else focused on single
issues (e.g., desegregation), rather than the
organized ideology of conservative intellectuals.
Indeed, a case has been made (Rovere,
1964) that among those lacking such an
ideology was the candidate himself.

One conclusion seems clear for the measure- 
ment of attitudes and personality, faced with 
twin dangers like agreement response bias and 
faking. It is supported by general consideration
of the distinctive ways that authoritarianism
scales depart from normal self-report
measurement and by the evidence from
studies of college students and surveys of the
general public: It is not possible to establish
the existence of complex ideologies by the
fact that people tend to respond “agree”
rather than “disagree” to a series of complex
statements. If complex statements are presented
to those who are simple-minded, they
have no choice but to answer on the basis of


EFFECTS OF MOTIVATION ON THE AVAILABILITY
AND RETRIEVAL OF MEMORY TRACES


BERNARD WEINER
Center for Personality Research, University of Minnesota


A review which analyzes a vast array of studies relating motivation and memory
is presented. Investigations in which the motivational manipulation occurred
during trace formation are distinguished from studies in which the
manipulation occurred during trace storage or trace retrieval. The review includes
a series of investigations by the author which varied the incentive for
retaining stimuli. The general conclusion is that many studies in the area are
methodologically inadequate, and have yielded conflicting results. However,
there are studies which provide strong evidence that memory can be influenced
by nonassociative factors.


Retention is generally conceptualized as a 
multistage process (Cameron, 1947; Melton, 
1963; Rapaport, 1942; Underwood, 1964). 
The initial period involves sensory or idea- 
tional registration and the subsequent fixation 
of the memory trace of that event; this is the 
stage of learning or trace formation. The 
second process in the sequence includes the 
interval in which the trace is latent, yet 
potentially available for evocation. During 
this “storage” period it has been hypothesized 
that dynamic processes such as reverberation 
(Hebb, 1949), perseveration and consolida- 
tion (Glickman, 1961; Müller & Pilzecker, 
1900; Walker, 1958), integration and or- 
ganization (Bartlett, 1932; Koffka, 1935), 
and autonomous decay (Brown, 1958; Koffka, 
1935) proceed without additional sensory 
input. During storage the trace may also be 
modified as a function of new experiences: 
unlearning (Barnes & Underwood, 1959; 
Briggs, 1954; Melton & Irwin, 1940) and 
interference (McGeoch, 1932; Postman,
1961; Underwood, 1957) have been identi- 
fied as factors causing trace alteration. The 
third step in the memory sequence is the 
recall or evocation of the trace: “The image 
is awakened and ecphorized [Semon, 1904],” 
and there is a recurrence of the initial 
stimulus. 

It is apparent that variations in the conditions
extant at any of the three points in the
memory sequence can affect subsequent recall.
For example, one consequence of additional 
sensory input during the presentation of a 
to-be-remembered unit is a decrease in the 
probability that the unit will be encoded and 
become a memory trace (Broadbent, 1958). 
Throughout the period of trace storage, 
re-presentation or rehearsal of the original 
stimulus may prevent the onset of the autono- 
mous decay of that trace (Brown, 1958). 
Conversely, the probability of forgetting a 
stimulus increases as a function of the amount
of information reduction required during the
period of storage (Posner & Rossman, 1965).
At the time of recall, change of task or
situation variables from those existing at
the time of original learning decreases the
probability of recall (Melton, 1963). One exception
to this principle is illustrated in a
study by Bilodeau, Fox, and Blick (1963).
They facilitated the evocation of to-be-remembered
units by presenting stimuli
highly associated with the to-be-recalled items
as “reminders” during the recall period.

Inasmuch as the dominant theory of recall
today is that of associationism (Asch, 1964),
the overwhelming majority of experimental
investigations of retention have manipulated
what generally are classified as associative
variables. During the period of trace formation,
the independent variables related to
subsequent recall have included the degree of
original learning, frequency of stimulus presentation,
meaningfulness of items, etc. In the
course of stimulus storage, the manipulations
have involved the similarity of the interpolated
activity and original learning, degree
of learning of the interpolated material, etc. 
Finally, during the trace retrieval, associative 
manipulations embrace the vast array of work 
subsumed under the rubric of transfer 
(Melton, 1963). All these experiments have 
greatly enhanced our knowledge about the 
role of stimulus-response units or habit 
factors in retention. 

There remain, however, a number of relatively
dormant questions pertaining to the
relation between nonassociative factors and 
retention. It seems intuitively reasonable to 
submit that variations in the motivational 
variables existing at any of the stages in the 
memory process will also affect subsequent 
recall.

This paper undertakes an analysis of 
motivational factors which have been em- 
ployed in studies of retention. Following 
Brown (1961), it is assumed that motiva- 
tional concepts are necessary in situations 
where vigorous responses are elicited by weak 
stimuli, and when there is variability of response
in the presence of constant stimulating
conditions. Motivational variables shall not
be limited to the concept of drive, as Brown
suggests. Rather, as recommended by Atkinson
(1964), any contemporary determinant
of behavior is considered to be a motivational
construct (although for present purposes
associative or habit factors will be omitted).
This includes temporary and relatively stable
states of the organism (arousal level, hypnotic
versus normal condition, motives, attitudes),
and characteristics of environmental
objects or events (completed and incompleted
tasks, succeeded and failed tasks, stressful
and nonstressful situations, pleasant and unpleasant
experiences, positively and negatively
evaluated words, consonant and dissonant
messages, positive and negative
incentives).

This paper reviews studies of memory using
human subjects which have employed motivational
manipulations; the investigations in
which manipulations occur during the periods
of trace storage or trace utilization will be
differentiated from those in which the manipulation
is operative during the period of
trace formation. This review presents an overview
of general areas of study, and is not
intended as an exhaustive literature survey.
The relevant literature published before 1940
generally will be neglected; Rapaport (1942)
has thoroughly reviewed this material. Further,
the literature examining the influence
of drugs on retention and the effects of moti- 
vation on the mechanisms involved in the 
memory process will not be included in this 
review. The final section of this paper will 
include experiments currently being con- 
ducted by this writer. These studies relate 
the magnitude of incentives to the availability 
and retrieval of memory traces. 


MOTIVATIONAL MANIPULATIONS DURING 
TRACE FORMATION 


All but a few investigations relating moti- 
vation to memory have employed motivational 
manipulations at the time of stimulus regis- 
tration, and subsequently measured the 
amount of stimulus retention. Keppel (1964) 
and Underwood (1954, 1964) have argued 
convincingly that this type of procedure is 
replete with dangers. If the manipulation af- 
fects the degree of original learning, then it 
is impossible to attribute differences in recall 
to inequalities in retention (trace storage or 
trace utilization). This does not mean that 
the retention function of the material learned 
under heightened versus normal motivating 
conditions does not differ; it does mean that 
without further controls the data are con- 
founded and no definitive conclusions can 
be reached concerning the relation between 
motivation and retention. 


Individual Differences and the Recall of In- 
completed and Completed Tasks 


In a number of studies conducted under 
the aegis of Lewin (e.g., Marrow, 1938a, 
1938b; Zeigarnik, 1927), superior recall of 
incompleted over completed tasks (the Zei- 
garnik effect) was demonstrated. However, 
other investigators (e.g., Glixman, 1949; 
Rosenzweig, 1943) have found greater recall 
of completed than incompleted tasks. One 
resolution of this apparent contradiction was 
suggested by Atkinson (1953). He established 
that subjects high in need for achievement 
recall more incompleted than completed tasks
in situations where achievement motivation
is aroused, while subjects low in need for 
achievement tend to recall more completed 
than incompleted tasks under achievement- 
oriented conditions. Analysis by Atkinson of 
the subjects used by Zeigarnik and Marrow 
led to the conclusion that these subjects were 
relatively high in need for achievement. On 
the other hand, subjects participating in the 
experiments of Glixman and Rosenzweig were 
judged to be relatively high in fear of failure. 
An experiment by Green (1963) demon- 
strated that volunteer subjects exhibit a 
greater Zeigarnik effect than nonvolunteer 
subjects. Data cited by Atkinson (1953) 
indicate that volunteers are likely to be high 
in need for achievement; Green’s results tend 
to confirm the postulated association between 
achievement motivation and task recall. 

Other investigators (Alper, 1957; Ericksen,
1954) have proposed alternative interpretations
of the contradictory findings in this
area. Alper and Ericksen demonstrated that 
individual differences in “ego-strength” inter- 
act with the type of experimental conditions 
to differentially affect task recall. Individuals 
high in ego-strength exhibit greater recall of 
interrupted tasks under task-oriented condi- 
tions, and greater recall of completed tasks 
under ego-orientation. Conversely, individuals 
low in ego-strength recall more completed 
tasks under task conditions and more inter- 
rupted tasks under ego-involving instructions. 

While the interpretations of Alper and 
Atkinson are apparently conflicting, the data 
do indicate that personality dispositions to 
strive for success and/or to respond defen- 
sively to threat do affect recall. Further, 
Atkinson and Raphaelson (1956) demon- 
strated that potentially many motive states 
may influence recall. They found that sub- 
jects high in need for affiliation recall more 
interrupted tasks than subjects low in af- 
filiative needs, given instructions which arouse 
affiliative motivation.

For further discussion of experiments in- 
vestigating the recall of completed and 
incompleted tasks and closer examination of 
the methodological difficulties, the reader is 
directed to reviews by Alper (1952) and 
Butterfield (1964).


Success and Failure 


Analysis of the recall of tasks which
eventuate in success or failure overlaps
the study of the recall of incompleted and
completed tasks. Numerous investigators
(e.g., Atkinson, 1953; Rosenzweig, 1943)
specify conditions in which a completed task
connotes success and an incompleted task
connotes failure. (For a reversal of this
intrinsic association, see Marrow, 1938b.)
There may be, however, important procedural
differences between studies varying success
and failure and those manipulating the completedness
of a task. Success and failure can
be varied via experimental evaluation of the
subject’s performance (e.g., Russell, 1952;
Zeller, 1952). Employing false feedback to
manipulate subjective level of performance
has a distinct advantage: the original performance
of a “success” group and a “failure”
group may be equated. The equalization of
prior experiences is not possible in studies
employing tasks which vary in their degree
of completion.

It is regrettable that this advantage has not
been utilized. Either the false feedback has
been employed in conjunction with task completion
(Smock, 1957), or it has been used in
experiments which do not allow evaluation of
retention. These latter studies (e.g., Russell,
1952; Zeller, 1952) reported success or failure
at a learning task, and then had the subjects
relearn the material following the false
feedback. Differential retention was inferred
from the subsequent speed of relearning. Yet
experiments (e.g., Weiner, 1965, 1966) have
demonstrated that success and failure affect
subsequent level of performance, including
the speed of learning. Hence the speed of
relearning the original list is completely confounded
with retention, and no conclusions
concerning retention can be reached from
these experiments.


Stress 


A number of investigators have varied the 
degree of “psychological stress” during stimulus
input and measured subsequent retention.
Stress has been experimentally induced in a
variety of manners. The dominant method is
to employ instructions which convey to the
subject that his performance is being evalu- 
ated, and that the outcome reflects his ability 
or intelligence (e.g., Alper, 1946; Gilmore, 
1954; Kendler, 1949; Rosenzweig, 1943). 
Generally, investigators interested in this 
aspect of retention employ two degrees of 
stress in their experiments: an evaluative or 
ego-involving condition, and a nonevaluative 
or task condition. The relative recall of com- 
pleted and incompleted tasks is frequently 
employed as the dependent variable. 

Results of these investigations are conflict- 
ing. Forrest (1959), Gilmore (1954), Kendler 
(1949), Lewis and Franklin (1944), and 
Rosenzweig (1943) are among the investi- 
gators who find that the relative recall of 
completed (successful) to incompleted (failed) 
tasks increases as stress increases. Other in- 
vestigators, however, find this relationship 
does not hold true for all subjects. Alper 
(1957) concluded that individuals with 
weak egos will recall more incompleted than 
completed tasks in stressful situations. Atkin- 
son (1953) found greater recall of incompleted 
than completed tasks in stressful (achievement- 
oriented) conditions among subjects classified 
as high in need for achievement. Ericksen 
(1952) and Caron and Wallach (1957) 
demonstrated that there is a subgroup of indi- 
viduals who recall more failed than succeeded 
tasks in an evaluative orientation, and that 
these individuals exhibit selective perception 
and different patterns of recall in other 
situations.

Stress has also been manipulated by em- 
ploying conditions which are thought to be 
anxiety provoking. Smock (1957) employed 
jigsaw puzzles composed of Blacky picture
designs as his tasks and demonstrated that
recall of tasks which arouse anxiety is dampened.
On the other hand, Wittrock and
Husek (1962) showed that subjects who
expect to receive an exam retain more of an
irrelevant passage learned before the exam
period than subjects not expecting to be
tested.

In summary, the results of the effects of
stress on retention are conflicting. Further,
there is a blatant learning-retention confusion
in many of the studies. On the basis of
these investigations, no definitive statement
concerning the relation of motivation and
retention can be made.


Affective Experiences and Word Value 


During the period of 1920-1940 a plethora 
of studies concerning the retention of pleasant 
and unpleasant experiences was conducted 
(see reviews by Osgood, 1953, and Rapaport, 
1942). Many of these experiments “tested” 
the psychoanalytic theory of repression by 
relating retention to hedonic tone. Since the 
1940s, there has been little additional litera- 
ture in this area although the basic question 
concerning the relevance of affective quality 
and affective intensity to memory remains 
essentially unanswered. 

To the best of this writer’s knowledge, 
Turner and Barlow (1951) conducted the 
last study bearing on the recall of pleasant 
and unpleasant experiences. They found no 
differences in recall as a function of type 
of affective experience; but intense experi- 
ences, whether pleasant or aversive, were 
retained better than neutral experiences. 
Rapaport (1942) reached a similar conclu- 
sion on the basis of the studies he reviewed. 
Dudycha and Dudycha (1941) concluded 
that “reports are better than two to one in 
favor of recall of pleasant memories as against 
unpleasant ones [p. 678].” Thus there is some 
evidence that the quality as well as the in- 
tensity of affective experience affects retention. 

In addition to the recall of affective experi- 
ences, there has been a variety of studies 
pertaining to the retention of affective words. 
Jones (1958) and Keet (1948), in accord- 
ance with psychoanalytic conceptions, found 
greater memory disturbances for traumatic 
than neutral words. However, Grummon and 
Butler (1953) and Merrill (1952) were un- 
able to replicate Keet’s finding. Similarly, 
Laffel (1952) found no differences in reten- 
tion of disturbing or neutral words. Klugman 
(1956) also found that the character of af- 
fective words did not influence retention, 
although the intensity of affect was related 
to later recall. Kott (1955) added another 
dimension to the already confusing facts. He 
demonstrated that there is an interaction 
between the affective quality of the words 
and the type of subject attempting to retain
the material. In his study, highly anxious 
subjects retained sexual words better than 
neutral words, while normals were better 
able to retain neutral than sexual words. 
These findings were opposite to his predic- 
tions. Recently, Amster (1964) found that 
positively evaluated words were recalled 
better than negatively evaluated words, and 
both of these were retained better than neu- 
tral words. Although Amster controlled for 
the associative value of the material, she 
attributes the differential retention to other 
associative principles. 

In summary, the findings in this series of 
studies are not definitive. There does seem to 
be evidence that the intensity of affect at the 
time an event occurs is related to subsequent 
recall. The data concerning the association
between the quality of affect and memory are
less substantial. There is some evidence,
albeit inconclusive, that pleasant experiences
and material are retained better than
unpleasant events.


Arousal 


Following the publication by Bills (1927), 
there was a flurry of studies investigating the 
influence of dynamogenic factors on performance.
Indirect descendants of this work include
experiments relating muscular tension
to recall. Smith (1953) found that muscular 
tension declines more following a completed 
than incompleted task. He interprets the 
Zeigarnik effect with Hebbian principles of 
organization and persisting central neural 
processes. Forrest (1959) cast doubt on the 
adequacy of Smith’s data. Forrest found 
greater recall of incompleted tasks under non- 
stress conditions, and greater recall of com- 
pleted tasks under stress conditions. In both 
conditions, the muscular tension associated 
with an incompleted task was greater than 
the tension corresponding to a completed 
task. 

A different approach to the relation be- 
tween arousal and retention has been followed 
by Walker (1958). Walker has presented a 
general theory of learning and retention 
which interrelates concepts of arousal, action 
decrement, and consolidation. The major
prediction of this theory is that high arousal
during learning makes the trace of the material
less available for immediate recall, but results 
in greater permanent memory. Kleinsmith and 
Kaplan (1963), using skin resistance to 
measure degree of arousal, found immediate 
recall of stimuli associated with high arousal 
to be relatively poor; but, following 45- 
minute or 1-week intervals, recall of these 
stimuli was relatively high. Walker and Tarte 
(1963) replicated these results using verbal 
stimuli a priori classified as highly arousing; 
they also replicated the findings of Klein- 
smith and Kaplan, using the skin resistance 
measure of arousal. Kleinsmith, Kaplan, and 
Tarte (1963) also demonstrated a positive 
relationship between degree of arousal and 
long-term retention. 

Walker’s theory has received additional 
support from a series of studies investigating 
the effects of delayed auditory feedback 
(DAF) on retention (King, 1963; King & 
Dodge, 1965; King & Wolf, 1965). King and 
his associates have found that the immediate 
recall of a story is weakened when part of 
that story is read aloud under DAF condi- 
tions. However, following a 24-hour time 
interval the differences in recall between the 
DAF and the control groups decrease or dis- 
appear entirely. King and Wolf concluded 
that in the DAF condition: “The informa- 
tion got into the central nervous system 
during the learning trial, but it was available 
only for delayed use [p. 138].” King and 
Wolf also present physiological evidence that 
DAF causes relatively high arousal. Conse- 
quently, their data partially substantiate pre- 
dictions from Walker’s theory of action 
decrement. King and his associates, however, 
did not find that following a 24-hour delay the 
recall of stimuli learned under the DAF con- 
dition is greater than the recall of stimuli 
learned under normal conditions. Walker 
might have expected this result. 

Although Walker and King and their asso- 
ciates manipulate motivation during the time 
of stimulus input, differential retention must 
be attributed to processes which occur while 
the stimulus is in storage. This interpretation 
is relatively incontrovertible owing to the 
interaction between time of recall and degree
of arousal. 


Hypnosis 


The relatively long duration of posthyp- 
notic suggestions and the alleged ability of 
individuals in a hypnotic trance to recall ma- 
terial not available during an ordinary waking 
state (hypermnesia) indicate that hypnosis 
affects the memory process. Kellogg (1929), 
Patten (1930), Weitzenhoffer (1950), and 
Wells (1940) report posthypnotic suggestions 
which remained effective for durations of 3 
weeks to 1 year without any intervening 
practice. Ericksen and Ericksen (1941) note 
cases in which a posthypnotic suggestion was 
carried out following a 5-year delay period. 
In these examples, the unusual availability of 
the trace strongly suggests that motivational 
variables which render that trace less subject 
to interference or decay are operative during 
stimulus input. 

Hypermnesia of early childhood experiences 
has been documented in early clinical cases 
by Freud and Breuer (1924) and Ericksen 
(1937). In addition to clinical reports, there 
has been a series of experimental investiga- 
tions of material recalled during an hypnotic 
state. In these experiments, the motivational
manipulation occurs at the time of trace 
retrieval; they will be discussed in detail later 
in this paper with other studies demonstrating 
a relationship between motivation and trace 
utilization.


Attitudes 


There are two distinct groupings of empirical
investigations which focus upon the influence
of attitudes on retention. One cluster of
studies has investigated the recall of
controversial material which is consonant or
dissonant with the opinions held by the subject.
The second has examined the retention of
personal evaluative ratings by others which
agree or are discrepant with the
self-perceptions of the subject.

The initial support for the assertion that
one’s frame of reference influences retention
was provided by Watson (1939), Edwards
(1941), and Levine and Murphy (1943). In
these studies, attitudes toward controversial
issues (the New Deal, communism) were
assessed, subjects read passages which
supported or contradicted their point of view,
and retention of the passages was measured.
The data revealed that subjects remembered 
material which supported their beliefs better 
than material which conflicted with their ex- 
isting attitudes. Alper and Korchin (1952) 
found that following the reading of a passage 
relatively derogatory toward females, males 
recalled more anti-female items than females, 
while females recalled more anti-male items 
than males. Subsequently, Taft (1954) ob- 
tained data which contradicted some of the 
findings of Alper and Korchin. However, he 
did find that, over a period of time, Negroes 
forgot more unfavorable material and recalled 
more favorable material from a passage perti- 
nent to the evaluation of Negroes. This trend 
was not exhibited by white subjects. Garber 
(1955) then demonstrated that both belief 
of a statement and the affective attitude 
toward that statement influence later ability 
to recall the material. 

In summary, there is strong support for 
postulating a linkage between frame of refer- 
ence and recall. Nevertheless, as Henle 
(1955) points out, this divulges nothing 
about how or why such a relationship occurs. 
Henle suggests that the inequalities in reten- 
tion may be due to differences in understand- 
ing, familiarity, attention, degree of structure, 
or intention to use the material. Note that 
the alternatives suggested by Henle primarily 
invoke principles of learning or trace registra- 
tion to explain differences in recall. Recently, 
Fitzgerald and Ausubel (1963) found that 
differential rates of forgetting of a passage 
about the Civil War were eliminated when 
general knowledge concerning the issues was 
reduced. They attribute the differences in 
retention reported above to cognitive (learn- 
ing) rather than to affective factors. Fitz- 
gerald and Ausubel postulate that “other 
side” arguments are forgotten more easily 
because they are not incorporated into any 
organizational or conceptual framework. The 
influence of frame of reference as a motiva- 
tional variable affecting retention therefore 
remains to be established. 

The investigation of the recall of personal
evaluations was initiated by Wallen (1942). 
Wallen realized that the study of the reten- 
tion of controversial material was confounded 
because of the inequalities in original learn- 
ing. To overcome this problem, he employed 
a method in which the relevant attitudes 
were unrelated to the amount of prior 
commerce with the content of the message. 
Wallen presented subjects with bogus character
evaluations. When testing for later recall
of these evaluations, he found that the ratings 
tended to be altered to make them compatible 
with the subjects’ opinions of themselves. 
Shaw (1944) replicated this finding and 
demonstrated that fewer errors were made 
in recall when the evaluations of the alleged 
rater were favorable. Later, Shaw and 
Spooner (1945) found greater recall of self- 
other ratings which were in agreement, al- 
though there was no difference in the recall 
of favorable or unfavorable evaluations. 
Kamano and Drew (1961), however, found 
better recall of unfavorable evaluations in a 
test of immediate recall. Thus, there is evi- 
dence that concordant evaluations are re- 
tained better than discordant ratings. The 
data concerning the influence of the posi- 
tiveness of the rating on retention are con- 
flicting, and no conclusions can be reached. 


Incentives 


The majority of studies reviewed thus far 
have been plagued by the learning-retention 
confounding: investigators interested in re- 
tention have not controlled for the degree of 
original learning. If one is interested in reten- 
tion, then purposefully varying conditions 
which are thought to affect learning seems 
to be an especially poor procedure. Yet this 
approach has yielded important data linking 
motivation and retention. Heyer and O’Kelly 
(1949) had subjects learn nonsense syllables 
under different intensities of motivation. One 
group of subjects was told that their perform- 
ance would be included as part of their course 
grade, while a control group was merely asked 
to learn the list. Heyer and O’Kelly found 
no significant differences in the original learn- 
ing of the list, but did find differences in 
retention. Following a 1-week interval, the
highly motivated group retained more of
the list than the control group. Heyer and
O’Kelly concluded that motivational
intensities under which habits are originally
learned affect later retention. Owing to the
(unexpected) equality in the degree of original
learning, the results provide strong
evidence that motivation affects retention. (For
further work in this area using animal
subjects, see Heyer and O’Kelly, 1951.)


MOTIVATIONAL MANIPULATIONS DURING 
TRACE STORAGE OR RETRIEVAL 


The essential characteristic of the following
investigations is that a motivational
manipulation occurs following original learning,
that is, during the period of trace storage or
trace utilization. These few studies provide
the best test of the hypothesis that motivation
affects retention. The experiments will
be presented in detail.


Incompleted and Completed Tasks 


Caron and Wallach (1957), guided by
findings of Alper (1946), Atkinson (1953),
and Ericksen (1952, 1954), reasoned that
there is a subgroup of individuals (F group)
who will exhibit a Zeigarnik effect under
stressful, or ego-involving, conditions.
Similarly, there is a subgroup of individuals (S


is usually theorized that the F group recalls 
more incompleted tasks because of the per- 
sistent tension system directed toward task 


ferential recall should be reduced by reveal- 
ing that the experiment is a hoax and that 
some of the tasks are insoluble. That is, the 
F group should subsequently recall fewer 
incompleted tasks because the tension system 
associated with an incompleted task is dis- 
charged. Correspondingly, the S group should 
subsequently recall more incompleted tasks. 
The repressed material would no longer be 
perceived as threatening, and should reemerge 
into consciousness.

Caron and Wallach employed the usual
Zeigarnik paradigm, with relaxed and stressful
conditions. Subjects were classified into S
and F groups on the basis of factor analytic 
considerations. Results at the immediate- 
recall period revealed the expected interaction 
between the experimental conditions and the 
individual difference classification. The sub- 
jects were then informed that the experiment 
was “fixed,” and that some of the puzzles 
were insoluble. Two additional recalls re- 
vealed that there was no shift in the pattern 
of recall following this feedback. It was 
therefore concluded that the initial differences 
in recall were due to selective learning rather 
than selective retention. Caron and Wallach 
suggest that differential rehearsal of the ma- 
terial might account for the differences in the 
degree of learning. 


Stress 


Chansky (1956) had subjects engage in 
a serial learning task until the list had been 
mastered. A second list was learned as an 
interpolated task. Following the learning of 
List 2, subjects were asked to recall the 
items in List 1. One group of subjects was 
told that the number of correct items recalled 
would “measure both one’s native intellectual 
capacity as well as certain neurotic tendencies.”
A second group recalled under neutral
conditions. The results indicated that the
heightened stress dampened the amount of
recall.

In Chansky’s study, the degree of original
learning of the two groups was almost identical,
while there was a significant difference
in recall. It must be concluded that the stress
manipulation affected certain retention processes.
The differential recall could be attributed
to differences in the availability of
the trace or to inequalities in the organism’s
potential for trace utilization. It is important
to observe that Chansky’s result replicates
some of the findings discussed previously.


Affective Experiences and Affective Words 


In his book, A Model of the Mind, Blum
(1961) presents some exploratory investigations
demonstrating the influence of affect on
memory. Hypnotically-trained subjects were
asked to recall an overlearned series of Blacky
picture stimuli. Positive or negative affect was 
then associated with a given Blacky card 
by hypnotically-induced suggestion. Blum 
investigated the effect of this emotional at- 
tachment on the order or sequence of stimu- 
lus recall. An example will best describe the 
procedure: It was observed that in free recall 
one of the subjects invariably recalled Card 
II immediately after Card VII. Under hyp- 
nosis, positive affect was attached to Card II 
and negative affect or anxiety to Card VII. 
On the following free-recall trial, Card II was 
recalled prior to Card VII. Other instances 
are described by Blum which reveal that emo- 
tional loading can dramatically alter the 
recall sequence. 

Blum finds that pictures which are asso- 
ciated with anxiety are generally recalled later 
in the series. However, the anxiety may 
facilitate recall if it is of sufficient intensity. 
His findings tend to corroborate the previous 
studies indicating that affective intensity 
influences recall, and that intensity may be 
a more reliable determinant than affective 
quality. 

In studies utilizing affective words, two 
procedures have been used to select stimuli. 
Either the stimuli are intuitively agreed upon 
as being affective, for example, RAPE, KILL, 
etc.; or specific stimuli which have personal 
affective meaning are chosen for each indi- 
vidual. Clemes (1961; reported in Hilgard, 
1964) used the latter procedure. Following 
Keet (1948), he designated words requiring 
a long latency before eliciting an associative 
response as affective. While under hypnosis, 
subjects engaged in a verbal learning task 
which was in part composed of these trau- 
matic words. After the list was mastered, 
Clemes instructed subjects to forget half 
the words on the next trial. He found that 
the “forgotten” words were the complex- 
related stimuli identified earlier. That is, 
the posthypnotic amnesia predominantly af- 
fected the recall of the affect-laden words. 
With removal of the hypnotic suggestion, the 
entire list was recalled. 

As Walker (1964) points out, this experi- 
ment enables one to separate the process of 
trace storage from trace retrieval. In Clemes’ 
study, the stimuli were in storage, as revealed
by subsequent recall, yet the subject was 
unable to retrieve them. The experiment 
clearly demonstrates that motivational ma- 
nipulations can influence retention. 


Arousal 


Early investigations of dynamogenic fac- 
tors influencing recall (e.g., Courts, 1939) 
often had subjects learn verbal material while 
squeezing a dynamometer. Their recall was 
then compared with a group learning without 
any increased muscular tension. Bourne 
(1955) modified this procedure to enable him 
to separate the effect of arousal (drive) on 
learning from the effect of arousal on reten- 
tion. Tension was applied (T) or withheld 
(N) during both the learning and recall 
phases of his experiment. Four experimental 
groups (T-T, T-N, N-T, N-N) were estab- 
lished, and five recall intervals were employed 
between the time of learning and the time of 
recall. Bourne found that differences in recall 
between the T-N and N-N groups, and be- 
tween the T-T and N-T groups, were not 
significant at the longest time interval of 4 
minutes. That is, tension applied during learn- 
ing had no effect on recall when tension 
during the recall phase was controlled. On 
the other hand, there were significant dif- 
ferences in recall between the N-T and N-N 
groups, and between the T-T and T-N 
groups. Recall under tension was better than 
recall under no tension when the degree of 
arousal during original learning was con- 
trolled. Bourne argues that previous differ- 
ences found in recall when tension was ap- 
plied during learning were caused by the 
tension persisting until the recall period. His 
data tend to confirm this interpretation: over 
time the differences in recall between groups 
learning under tension and no tension steadily 
decreased. After 4 minutes, the tension ap- 
plied during learning was expected to be 
completely dissipated; at this point recall 
was almost identical between groups learning 
under the different conditions. These data lead 
Bourne to conclude that induced tension 
facilitates response elicitation (trace utiliza- 
tion), but does not alter habit strength. 


Hypnosis 


Hypnotic experimentation on the capacity
of individuals to recall previous events has
an early and relatively brief history. H
(1930) and Mitchell (1932) had subjects
learn nonsense syllables in a waking state
and recall in either a waking or hypnotic
state. These investigators found no difference
in recall as a function of the conditions during 
trace retrieval. Stalnaker and Riddle (1932) 
asked subjects to recall a poem learned earlier 
in life, and found greater remembrance for 
the subjects in an hypnotic trance at the time 
of recall. White, Fox, and Harris (1940) 
hypothesized that the contradictory results 
of the prior studies were due to differences 
in the meaningfulness of the material to be 
recalled. They found that during an hypnotic 
trance the recall of meaningful material was 
facilitated, while this was not true for
nonsense syllables.

The most thorough study of the effects of
hypnosis on recall was undertaken by Rosenthal
(1944). He had subjects learn material
during a waking state, recall while waking,
and recall again during either a waking or
trance condition. In a series of experiments,
the material was varied to include nonsense
syllables, poems, profane words, verbal lists
associated with success and failure, and
completed and incompleted tasks. Rosenthal


is relatively free from anxiety, and, therefore,
does not repress unpleasant experiences.
Again, in these experiments, the degree of


(equal). The differences in recall are to be
attributed to retention processes which are
influenced by the hypnotic trance.


Current Research Employing Incentives 


During the last 5 years there has been a
voluminous growth in the study of short-term
memory (Melton, 1963). Associated with this
growth, and at least in part responsible for it,
has been the development of sophisticated
methodological innovations (e.g., Averbach &
Coriell, 1961; Lloyd, Reid, & Feallock, 1960;
Peterson & Peterson, 1959). Weiner and
Walker (1966) adopted the Peterson and
Peterson technique in a study relating motivation
and memory. Consonant trigrams cued
for four different incentive conditions were 
used as stimuli. The background color, on 
which the stimuli were flashed, provided the 
discriminative signal. Four incentive condi- 
tions were used: 1 cent reward for correctly 
recalling the stimulus; 5 cents reward for cor- 
rectly recalling the stimulus; shock for not 
recalling the stimulus; and a control condition 
where neither shock nor money was at stake. 
Two recall time intervals were employed: 
4.67 seconds and 15 seconds. During the 
recall interval, subjects read digits in time 
to a metronome beating 3.25 times per second.
The results revealed that there was an
interaction between the motivational condi- 
tions and the interpolated time interval. At 
the shorter interval, there were no significant 
differences in recall; while at the longer in- 
terval, stimuli cued for 5 cents and shock 
were recalled significantly more than stimuli 
cued for either 1 cent or nothing. Because 
there were no significant differences in recall 
at the shorter time interval and differential 
decay rates were found for identical stimuli, 
differences in recall were interpreted as dif- 
ferences in retention (trace storage) rather 
than differences in the degree of original 
learning. The finding that heightened motiva- 
tion during stimulus input enhances subse- 
quent recall has been replicated in five other 
experiments at the University of Minnesota 
Center for Personality Research (Kernoff, 
Weiner, & Morrison, 1966). 

One explanation of these results is that 
heightened motivation during learning makes 
the mnemonic trace less subject to proactive 
or retroactive interference. Marrow (1938a) 
found the Zeigarnik effect greatest in the 
middle of the serial order of tasks, where 
interfering tendencies are maximal. Prentice 
(1943) reported that differences in recall be- 
tween material learned under heightened as 
opposed to normal motivating conditions
increased when subjects were given interpolated
activity prior to recall. In a recent study, the
Weiner and Walker (1966) experiment was
modified by increasing the difficulty of the
interpolated activity. Subjects were required
to read pairs of digits, add them, and classify 
the sums as odd or even prior to the time 
of recall. Posner and Rossman (1965) have 
demonstrated that this interpolated activity 
markedly reduces the amount of retention. 
Results again indicated an interaction be- 
tween the incentive condition and time of 
recall. More important, however, was that 
the differences in recall between the motiva- 
tional and control conditions were greater 
than the differences found using the easier 
(Weiner and Walker) interpolated task. This 
finding suggests that the influence of motiva- 
tional variables on retention will be most 
evident in situations which are conducive to 
forgetting. 

In the Weiner and Walker study the tempo- 
ral order of events was: 


(stimulus + incentive cue) → interpolated activity → recall


Two other temporal sequences also have been 
employed in the Center for Personality Re- 
search studies: 


(a) stimulus → interpolated activity → (incentive cue + recall)


(b) stimulus → interpolated activity → incentive cue → recall


In the latter experiments, the level of motiva- 
tion is manipulated immediately prior to or 
during the period of trace retrieval, rather 
than during trace formation or trace storage. 
Thus far, four studies have failed to yield 
significant differences in recall as a function 
of the magnitude or quality of incentives. 
Presently, investigations are underway 
which vary the magnitude and type of incen- 
tive, the difficulty of the interpolated activity, 
and the point in the memory sequence at 
which the incentive cue is presented. Other 
methodologies, such as the running memory 
span (Lloyd, Reid, & Feallock, 1960), are 
also being modified to study the influence of 
motivational factors upon memory. 


BIRTH ORDER AND SOCIAL BEHAVIOR


JONATHAN R. WARREN 
College Student Personnel Institute, Claremont, California 


Recent research of Schachter has redirected a long-standing interest in physio- 
logical, psychological, and sociological correlates of order of birth to affiliative 
or withdrawal tendencies as birth-order correlates. The most firmly established 
and persistent finding relative to birth order shows an overproportion of 
firstborn children in college. Substantial evidence also exists showing (a) firstborn
to be more susceptible than later born to social pressure and (b) firstborn
women, when anxious, to be more strongly attracted than later-born
women to the company of others. Studies relating birth order to conformity or
dependence, delinquency, alcoholism, and schizophrenia as well as college
attendance and affiliative behavior are reviewed.


Order of birth and its consequents have
been of interest intermittently to psychologists
for almost a century, and perhaps longer
(e.g., Galton, 1874). During that period, it
has been examined in relation to prominence, 
intelligence, delinquency, and physical and 
mental illness (Chen & Cobb, 1960; Ellis, 
1904; Jones, 1931; Rosenow & Whyte, 1931; 
Sletto, 1934; Thurstone & Jenkins, 1929). 
Recently, Schachter (1959) focused attention 
on birth order as a determinant of social af- 
filiation or withdrawal and such related phe- 
nomena as college attendance, alcoholism, and 
schizophrenia.

Schachter observed that, under stress, firstborn
college girls tended to seek the company
of others while later-born girls tended to
withdraw. He then found that a general
expectation that, under stress, firstborn would
prefer affiliation while later born would tend
to withdraw led to accurate predictions of
combat flying effectiveness and incidence of
alcoholism. A variety of studies that have
appeared since Schachter’s book provide
confusing evidence about the association between
birth order and affiliative behavior.

Many of the studies that followed Schachter’s
book and some of those that preceded
it but bear on affiliation will be reviewed.
Attention will be given to college attendance,
affiliative behavior, conformity or dependence,
delinquency, alcoholism, and schizophrenia as
forms of social behavior. Sources of uncertainty
in the studies (the definition of birth
order, the presumed origin of birth-order effects,
the type of effect studied, and sex) will
also be examined.


DEFINITION OF BIRTH ORDER 


In a deceptively simple form, birth order
is the sequential position of a person among
his or her siblings with respect to order of
birth. Complications occur, however, in various
modifications or “simplifications” of the
basic definition, and some additional refinement
is necessary. Some investigators compare
firstborn with all later-born subjects.
Others compare eldest with youngest. Still
others compare those born in the first half
of the sibship with those born in the second
half. Some investigators exclude only children
from their samples. Others consider only children
to be firstborn and treat them accordingly.


in quite different types of studies, color the
way birth order is defined. Investigators


The second major presumption, implied in
many of the articles reviewed, is that birth-order
effects have their origins in family interactions
that differ for different children in
the same family. The birth of a younger sib
may have quite different consequences for a
first child than for a second child. The sexes
of both children also seem to play a major
role in family interactions.

Those who study birth order from the 
family interaction frame of reference are 
more interested in the sexes of the sibship 
members than are those concerned with a 
physiological frame of reference. Age of the 
mother is of minimal interest. Social class
of the family should be a concern of both
the physiologists and the family interactionists,
since it is related to prenatal care and
dietary variables as well as to family interaction
variables.

A number of potential sources of physiological
variation associated with birth order
can be found. The intrauterine environment
provided by the mother may vary with the
mother’s age and with the number of previous
pregnancies, both associated with birth order.
Perinatal influences may also operate on
physiological characteristics. Length of labor
and frequency of the use of forceps in delivery,
for example, are associated with birth
order (Weller, 1965).

Differences in family environment for
second children as opposed to first children
seem generally acknowledged. Firstborn, for
some period of their life, have only adult
models in their immediate family and are
free of competition from siblings for parental
attention. Later-born children find more
competition for parental attention and have
older siblings as well as adults available as
models. The attitude of the mother toward
the child tends to be more relaxed, less anxious
with later-born children than with firstborn
(Lasko, 1954; Sears, 1950). Only and
youngest children are alike in not having the
experience of superiority with respect to a
smaller, less-experienced sibling that other
children have.

Throughout childhood, later-born children
tend to have already experienced vicariously
through older siblings the movement into
new and potentially stressful situations. Beginning
to associate with other children outside
the family, going to school, learning to
[...], and having a first date are examples of
situations that later-born children can be
expected to face with more aplomb than firstborn.
The ancient rules of primogeniture
still seem to put firstborn children into different
positions relative to society than later
born. In rural areas, the first son is still often
the one to stay on the farm, while later sons
enter other occupations. Beyond the farm
situation, first sons may have first call on
family finances in going to college, although
evidence for this is lacking.

Each of these possible sources of birth-order
effects suggests a somewhat different
definition of birth order. Interest in physiological
variation with birth order requires the
counting of all pregnancies, including those
terminated unsuccessfully, and suggests cumulative
effects with each later order of birth.
Some assumptions about family interaction
processes suggest comparisons of firstborn
with all other children. Others would lead
to comparisons of only and youngest children
with all others. Sex patterns of sibships and
age differentials within the sibship should also
have their effects on family and social determinants
of personal characteristics (Koch,
1957).

Studies in which birth order is defined only
as oldest versus others or as first half of the 
sibship versus second half or in some other 
limited way can be fruitful, particularly in 
the aggregate. But clear understanding of the 
nature and origins of birth-order effects, or 
simply of concomitants of birth order, cannot 
be developed without refinement of definition. 
All positions of birth, some index of the age 
relationships, and probably some attention to 
the sex pattern in the sibship should all have 
a place in an adequate operational definition 
of birth order (e.g., Waldrop, 1965).


CORRELATES OF BIRTH ORDER 


Schachter (1963) has reviewed earlier 
studies of relationships between birth order 
and eminence and between birth order and 
intelligence. He concluded that eminent peo- 
ple are far more likely to have been eldest 
or only children than to have been later-born 
children. The evidence for a relationship be- 
tween birth order and intelligence, however, 
is equivocal. Some studies were found that 
reported an intelligence advantage for first- 




BIRTH ORDER AND SOCIAL BEHAVIOR 


regardless of level of stress. The affiliative
tendency seems to arise at least in part from
a desire for comparative information about
[...]. Data were not presented on any interaction
effect of birth order and possession
of information on the desire to be with others.
Part of Schachter’s position is supported in
the Gerard and Rabbie data. Firstborn
women, under stress, do express stronger desires
to be with others than do later-born
women. The tendency toward withdrawal of
later born, suggested by Schachter, is not
supported. Almost all the women chose to be
with others.

Radloff (1961) manipulated the flow of information
among subjects to make some
undergraduate women believe their opinions
on a particular issue deviated from those of a
large majority of their peers. Another group
were made to believe their opinions were consistent
with those of their peers. All the girls
then reported the strength of their desire to
engage in a discussion of the topic in question.
Among firstborn subjects, those in conflict
with their peers expressed a stronger desire
for further discussion than did those in
agreement with their peers (p < .001). Among
later-born subjects, the corresponding difference,
while in the same direction, was not
significant. Firstborn and later-born groups
were virtually equal in desire for further discussion
when the two conditions of conflict
were combined. A significant interaction (p
< .01) between conflict and birth order on
desire for further discussion was interpreted
as a further indication of the role of birth
order in affiliative behavior.

Apparent conflict between one’s own opinions
and those of others seems to have a
stronger effect on firstborn than on later born.
But part of the difference Radloff found between
firstborn who were in conflict and firstborn
who were in agreement with their peers
was due to a relatively low desire for further
discussion among the firstborn in agreement
with their peers, an unexplained phenomenon.
The crucial test of the influence of birth
order on affiliative behavior when in conflict
with others would compare the strength of the
affiliative desire of firstborn and later born
in the condition of opinion conflict. That
comparison, made by the author using Rad-

order and stress is whether either variable by
itself is associated with affiliative tendencies.
In four studies (Conners, 1963; Dember,
1964; Sampson, 1962; and Staples, cited in 
Staples & Walters, 1961) the variable of in- 
terest was affiliative imagery produced in re- 
sponse to TAT pictures in presumably stress- 
free situations. In all four studies, firstborn 
subjects produced higher nAff scores than did 
later born, but the results are cloudy: 

Conners (1963) studied undergraduate men 
separated into groups of only children, first- 
born with siblings, and secondborn. Firstborn 
with siblings did not differ significantly from 
secondborn, but only children had higher nAff 
scores than either of the other groups. The 
appropriateness of including firstborn and 
only children in the same group is thus 
brought into question. 

Dember (1964) found higher nAff scores 
for firstborn than for later-born women. His 
results were not significant for men, although 
they were in the expected direction. Lack of 
significance may have been due to the small 
number of male subjects (n = 16). 

Sampson (1962) found higher nAff scores
among firstborn undergraduates of both sexes 
than among later born, but his sample of 61 
was too small for the differences to be signifi- 
cant except for the total group. Staples and 
Walters (1961) cited an earlier unpublished 
study by Staples in which firstborn subjects 
produced higher nAff scores than later-born
subjects. Details were not given, however, 
and the size and sex composition of his sam- 
ple were not reported. 

The studies based on TAT measures are 
consistent in suggesting that firstborn have 
stronger needs for affiliation than do later-born 
subjects. Each of these is limited in some re- 
spect, however, and the relationship between 
birth order and affiliative imagery is not yet 
clearly defined. The tendency for firstborn to 
produce higher nAff scores may be attributable
wholly to only children.

Stress has been shown to lead to affiliative
behavior regardless of birth position. Gerard
and Rabbie (1961) split their sample and
provided two levels of fear. Half their subjects
expected an electric shock that would
be quite painful. The other half were led to
expect only a mild shock. Regardless of birth
position or amount of information given the
subjects, those in the high-fear condition expressed
a stronger desire to be with others
than did those in the low-fear condition.

Sarnoff and Zimbardo (1961), without
regard for birth order and using undergraduate
men as subjects, established four conditions
of stress—high and low fear and high and
low anxiety. Fear was induced by leading
subjects to expect electric shock in an impending
psychological experiment. Anxiety was
induced by leading the subjects to expect
that they would be required to suck on a
variety of objects such as nipples, pacifiers,
and candy suckers, thus arousing repressed
oral libido and therefore anxiety. As in the
Gerard and Rabbie study, subjects could
await the threatening experiment alone or
with others and indicated the strength of their
desire to wait in either situation. The desire
to wait with others was strongest in the high-fear
group and lowest in the high-anxiety
group. Of the 20 subjects in all four experimental
conditions who chose to wait alone, 12
were in the high-anxiety condition and only
1 in the high-fear condition.

Fear was clearly related to affiliative desire
in both Gerard and Rabbie’s and Sarnoff and
Zimbardo’s studies. Sarnoff and Zimbardo,
however, presented convincing evidence that
the nature of the stress determines its effect

Staples and Walters (1961), studying susceptibility
to group pressure as a function of
birth order, induced stress in their subjects 
by convincing them they were likely to re- 
ceive an electric shock during the course of an 
autokinetic experiment. The emotion produced 
by fear of shock was called anxiety, and birth 
order and stress interacted in their effect on 
susceptibility to group pressure. Firstborn 
subjects apprehensive about being shocked 
perceived greater movement in the autokinetic 
situation and perceived it more quickly than 
did later born in either a shock or no-shock 
condition, or firstborn in a no-shock condition.
Staples and Walters’ subjects were all
female, and their findings are thus similar to 
those for Gerard and Rabbie’s female sub- 
jects, with suggestibility substituted for affilia- 
tive desire. The stress that Gerard and Rabbie 
called fear, however, Staples and Walters 
called anxiety. 

Sarnoff and Zimbardo’s study, in distin- 
guishing between fear and anxiety, suggests 
the need for care in establishing the nature of 
the stress being applied to the subjects. The 
same situation, moreover, may have different 
emotional effects on men and women, which 
may account for some of the sex differences 
found in affiliative tendencies under stress. 

Gerard and Rabbie directly, and others in- 
directly (Radloff, 1961; Wrightsman, 1960), 
provide evidence that the greater affiliative 
behavior of firstborn relative to later born, 
when it occurs, is an information-seeking 
form of behavior. Lack of relevant information
in any situation in which a person is directly
involved can be considered anxiety producing.
While Radloff’s findings are difficult to
interpret, the conditions under which affiliative 
behavior occurred were conditions involving 
lack of information. Wrightsman found tension
to be reduced when subjects were allowed
to converse but not when they were together 
without conversation. 

To this point, then, firstborn women seem
more sensitive than later-born women to
stressful or tension-producing situations.
While almost all women under stress prefer
the company of others to solitude, firstborn
women exhibit stronger desires for company
than do later-born women. Whether the desire
for the company of others can most usefully
be interpreted as a desire for information
or as a purely affiliative need has not yet
been settled. Dissonance theory (Festinger, 
1957) or balance theory (Heider, 1958) 
would lead to predictions of the kind of affilia- 
tive behavior that has been demonstrated, 
but would emphasize its information-seeking 
rather than affiliative nature. Neither theory, 
though, accounts for the birth-order effect 
found for women. 

Among women, stress by itself leads to 
affiliative or information-seeking behavior. 
Also among women, stress and birth order 
interact in their effect on affiliative behavior, 
firstborn under stress indicating a stronger 
affiliative desire than later born. 

Whether birth order is related to affiliative 
behavior in the absence of stress is still an 
unresolved question. Some investigators have
found nAff scores related to birth order
without stress operating. Others, in experi- 
ments in which stress was manipulated, found 
no relationship between birth order and affil- 
iative behavior in the no-stress condition. 

Among men, the effect of stress on affilia- 
tive behavior seems to depend on the nature 
of the stress, fear leading to affiliation and 
anxiety leading to withdrawal. The confound- 
ing of fear and anxiety may have prevented 
clear-cut findings on the relationship between 
birth order and affiliative behavior for men. 


Conformity-Dependence 


Several studies have probed the relationship
between birth order and behavior that could 
be classified as either conforming or depend- 
ent. 

Some time ago, Sears (1950) described 
tentative evidence that firstborn children were 
more dependent than later born. Mothers, in 
one study, described their firstborn children 
as more dependent than their second chil- 
dren. Teachers, in another study, rated first- 
born children as more dependent than later- 
born. The numbers involved in both these 
studies were small and Sears was reluctant to 
draw any but tentative conclusions. 

Dittes (1961) and Schachter (1964) have 
both reported firstborn to be more susceptible 
to social pressure than later born. Dittes manipulated
the degree of peer acceptance and
found later-born subjects to be virtually unaffected
while the behavior of firstborn varied
widely with variations in the regard they felt 
others had for them. Schachter found some- 
what similar results in the natural setting of 
college fraternities and sororities. Firstborn, 
more than later-born, students preferred to 
associate with popular peers. Thus firstborn 
chose their associates more in conformity with 
normative choices than did later born. The 
autokinetic study by Staples and Walters 
(1961), previously cited, also indicates that 
firstborn are more responsive to the sugges- 
tions of others than are later born. 

Becker, Lerner, and Carroll (1964) were 
able to reverse the relationship between birth 
order and the amount of yielding in an Asch 
situation by varying the amount of money at 
stake. The subjects were 15- and 16-year-old
caddies who were given no reward, 5 cents, or 
25 cents for correct judgments about the rela- 
tive lengths of two lines. Each subject made 
his judgments while believing others had 
made judgments contrary to the true situa- 
tion. Becker, Lerner, and Carroll hypothesized 
that firstborn would be more susceptible to 
normative pressure from others—pressure to 
conform for the sake of being in tune with 
others—but that later born would be more 
responsive to the behavior of others that had 
informative value. The no-reward condition 
was considered one in which the behavior of 
others was largely normative, conveying little 
information. The high-reward (25 cents) con- 
dition was considered informative, since the 
others were presumably risking something 
of value and therefore their judgments should 
be considered. The nickel-reward condition 
was intermediate and the no-reward condition 
low in informative value. According to the 
investigators’ interpretations, then, the in- 
formative nature of the behavior of the others 
was varied through three values—low, inter- 
mediate, and high. The normative nature of 
the three risk conditions presumably remained 
constant.

The results were as hypothesized. Firstborn 
made more errors than later born in the no- 
reward condition; later born made more er- 
rors than firstborn in the high-reward condi- 
tion. This was in part a replication of an 
earlier study in which only the no-reward
condition was used (Becker & Carroll, 1962).
Unexplained in the later study, though, is the 
finding that firstborn made more errors in the 
no-reward condition than in the other condi- 
tions. Nothing in the theory presented would 
predict a decrease in conforming behavior by 
firstborn subjects when the normative nature 
of the social pressure remained constant and 
its informative nature increased. The arti- 
ficiality of the situation and the complex and 
unexplained changes in behavior make the 
results difficult to interpret with any feeling 
of confidence. 

Three other studies (Arrowood & Amoroso, 
1965; Radloff, 1961; Sampson, 1962) have 
been reported that deal with birth order and 
susceptibility to group pressure, but each of 
these presents some difficulty in interpreta- 
tion. Only one study other than that of 
Becker, Lerner, and Carroll has been found in 
which firstborn were not more susceptible 
than later born to social pressure. Greater 
responsiveness of firstborn to social pressure 
has been found in studies with men only, 
with women only, with the sexes mixed, and 
from the early school years through college. 

The one deviant study (Sampson, 1962) 
suggested that firstborn undergraduate women 
may be more resistant than later born to the 
influence of a presumed expert on a topic. But 
the conditions of the experiment were highly 
artificial and the influence of the girls on 
each other was an unknown factor. In resisting
the influence of the expert, the girls may
well have been responding to the influence of
their own group.

A more serious challenge to the conclusion 
that firstborn are more sensitive than later- 
born children to social pressure is found in a 
study of the effectiveness of verbal approval 
as a reward (Walters & Ray, 1960). First- 
born, if they are more responsive to social 
pressure, should respond more strongly than 
later born to verbal expressions of approval. 
Improvement in performance on a task after 
verbal approval, however, was not related to 
birth order. Inducing a state of mild anxiety 
in the subjects, who were first- and second- 
grade boys, did result in greater improve- 
ment of performance under conditions of 
verbal reinforcement regardless of birth order.


Volunteering


A study that prompted others to give atten- 
tion to birth order was one in which firstborn 
were found to be somewhat overrepresented 
among volunteers for a psychological experi- 
ment (Capra & Dittes, 1962). The overpro- 
portion of firstborn, however, is only on the 
edge of statistical significance and involves a 
fairly small n. While 22 of 29 volunteers were 
firstborn, they were drawn from a pool of 100 
students of whom 61 were firstborn. Chi- 
square corrected for continuity does not 
reach the .05 level of significance. 
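
The arithmetic behind this statement can be checked with a short sketch. It assumes a 2 x 2 layout of volunteers versus nonvolunteers by firstborn versus later born, built from the counts given above; the original report may have arranged the test differently, and the function and variable names are illustrative only.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square with Yates' continuity correction for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Shrink |ad - bc| by n/2 before squaring, never below zero.
    shrunk = max(abs(a * d - b * c) - n / 2, 0)
    return n * shrunk ** 2 / (row1 * row2 * col1 * col2)


# Counts from the text: 29 volunteers (22 firstborn) from a pool of 100 (61 firstborn).
pool_total, pool_firstborn = 100, 61
vol_total, vol_firstborn = 29, 22

# Assumed layout: rows = volunteers / nonvolunteers, columns = firstborn / later born.
a = vol_firstborn                      # firstborn volunteers
b = vol_total - vol_firstborn          # later-born volunteers
c = pool_firstborn - vol_firstborn     # firstborn nonvolunteers
d = (pool_total - pool_firstborn) - b  # later-born nonvolunteers

CRITICAL_05_DF1 = 3.841  # tabled chi-square critical value, alpha = .05, df = 1
chi2 = yates_chi_square(a, b, c, d)
significant = chi2 >= CRITICAL_05_DF1
print(f"corrected chi-square = {chi2:.2f}, significant at .05: {significant}")
```

Under this layout the corrected statistic falls short of the tabled critical value for one degree of freedom, which agrees with the statement that the overproportion does not reach the .05 level.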

Capra and Dittes speculated that the na- 
ture of their experiment, one involving small 
groups in a cooperative task, attracted first- 
born more than later born. Suedfeld (1964), 
however, found virtually the same proportions 
of firstborn and later born among 29 volunteers
for an isolation experiment. The proportions
of firstborn and later born in the
pool from which the subjects were drawn are 
not known. Varela (1964), working in Uru- 
guay with a small number of high school stu- 
dents of both sexes (n = 66), found firstborn 
volunteering at a higher rate than later born 
for a small group experiment. 

Wolf and Weiss (1963) and Ward (1964) 
checked the findings of Capra and Dittes 
with larger samples and with different types of 
experiments for which the subjects could vol- 
unteer. Volunteering rates of firstborn and 
later born did not differ significantly for either 
sex, whether volunteering was for a group experiment
or for an experiment in which the
subject was alone with the experimenter, and
whether participation in an experiment was
required or not.

Any relationship between birth order and
volunteering, if it exists, has not been clearly
established. The nature of the stress in the
volunteering situation may have varied sufficiently
from study to study to produce conflicting
results. In view of the findings on
dependence and affiliation, birth order can be
supposed to influence volunteering behavior
where there is social pressure to volunteer.


Identification and Empathy 


Stotland and his colleagues have conducted
a series of studies on identification with others
and on feelings of empathy with others, and
have included birth order as a parameter. In
one study (Stotland & Dunn, 1963), later 
born tended to identify with a model to a 
greater degree than firstborn and only chil- 
dren. In this and in other studies by Stotland 
and his associates, identification is defined as 
the attribution to one’s self of characteristics 
observed in a model. They speculate that 
later-born children have grown up with avail- 
able models close to them in age and other 
attributes. Firstborn and only children, de- 
pending on adults as models, tend to identify 
less because of the obvious disparities be- 
tween themselves and their models. 

In another study (one difficult to interpret
because of the complexity and artificiality of
the experimental conditions and the small n’s),
later-born subjects seemed slightly more re- 
sponsive than firstborn to the influence of a 
model (Stotland & Cottrell, 1962). A third 
study, also rather complex, was interpreted 
to suggest that later born tend to empathize 
more readily than firstborn (Stotland & Dunn, 
1963). Empathy in these studies is considered 
a special case of identification, one in which 
the subject takes on the emotional tone of the 
model. Later-born subjects rated themselves 
more anxious after observing a model in a poor 
performance than after observing a model in 
a good performance. Firstborn and only chil- 
dren did not differ across the two kinds of 
model performances in their self-ratings of 
anxiety. A replication (Stotland & Walsh, 
1963) did not produce the same results. The 
differences in ratings of anxiety that were as- 
sociated with birth order were not interpret- 
able as showing greater empathy among 
later born than among firstborn.

The tenuousness of the results reported
and the difficulties of interpretation intro- 
duced by the complexity of the experimental 
conditions leave little room for confidence in 
any conclusion relating birth order to tend- 
encies toward identification or empathy. 


Delinquency 

Two early studies of delinquent or problem 
behavior gave attention to birth order. In the 
past 30 years, though, birth order as a factor 
in antisocial behavior has apparently been 
ignored.

In a study that treated birth order in great
detail, 786 delinquent boys listed in juvenile 
court records were matched, with respect to 
age and number of siblings, with 786 nonde- 
linquent school boys in the same city (Sletto, 
1934). The proportions of delinquents among 
firstborn and later-born children (with only 
children excluded) were not significantly dif- 
ferent. The highest proportions of delinquents 
were found among second children in three- 
and four-child families. The lowest propor- 
tions were among third children in three- and 
four-child families. When only families of 
three or more children were considered, sec- 
ond sons more often than third sons were de- 
linquent (# < .005). The position of second 
son in families of three or more children con- 
tributes proportionally more delinquents than 
does any other sibling position (p< .01), 
and the third-son positions contributes pro- 
portionally fewer than does any other (p < 
.02). While social class would be expected to 
differ for the delinquent and nondelinquent 
groups, any effect this might have on the re- 
lation between birth order and delinquency 
in this study is indeterminable. 

These are post hoc analyses and cannot 
lead by themselves to justifiable conclusions 
about birth order and delinquency. Moreover,
differences between second- and third-birth 
position in three- and four-child families offer 
difficult problems of interpretation. 


suggesting
birth order merited further study as a factor
in problem behavior. No evidence has been 
found that their suggestion was followed.


Alcoholism


One of Schachter’s hypotheses was that 
later-born children would become alcoholic 
more commonly than firstborn, since alcoholism
may be considered a withdrawal response
to stress. He then presented data that supported
his hypothesis (Schachter, 1959).
Smart (1963) questioned Schachter’s conclu- 
sions on the grounds that his sample was a 
particular subclass of alcoholics—those con- 
victed of crimes associated with alcohol, pri- 
marily public intoxication, that his data were 
related to size of family but not to birth 
order, and that allowance was not made for 
the smaller number of people found in large 
families. Smart examined the data on birth
order and family size for 242 alcoholics from 
three clinics in Ontario. The ratio of men to 
women was 10 to 1. He found no overrepresentation
of any birth-order position.

Smart did find a regular increase, with in- 
creasing family size, in the number of alco- 
holics relative to the number of persons in 
each family. He suggests that alcoholism may 
increase with family size as a result of frustra- 
tion of dependency needs in large families. He 
also suggests that social class may mediate 
between alcoholism and family size, large 
family size and alcoholism both being as- 
sociated with lower social class. 

To check the possibility that any overrepre- 
sentation of youngest children among alco- 
holics might be due to the presence of only 
one parent in families of lastborn children, 
de Lint (1964) examined the records of 276 
women admitted to the Addiction Research 
Foundation Clinic in Toronto during the years 
1951 to 1962. Lastborn were greatly over- 
represented among those raised since the age 
of 5 by only one parent. Among those raised 
by both parents, firstborn and lastborn were
equally represented. 


Schizophrenia 


Following Schachter’s suggestion that later- 
born children react to stress by withdrawal, 
Schooler (1961) looked for a higher incidence 
of schizophrenia among later born. Among 
seven studies Schooler reviewed that related 
birth order to schizophrenia, three supported 
the hypothesis that later born are more prone 
to schizophrenia, three showed a nonsignifi- 
cant trend in that direction, and one showed a 
nonsignificant trend in the reverse direction. 

In a sample of hospitalized schizophrenic 
women (n = 120), Schooler found a nonsignificant
difference between the numbers of
firstborn and lastborn, but significantly more 
schizophrenic women were born in the last 
half of their families, with respect to ordinal 
position, than in the first half (p < .05). 
Among catatonic patients, three times as 
many were later born as firstborn, and the
same proportions appeared in a sample of 
the female patients discharged since 1900. 
Later-born children outnumber firstborn in the 
general population by almost two to one, 
however, so the significance of Schooler’s 
findings is questionable. 

Schooler did demonstrate that the mothers 
of later-born children are, in general, older 
than the mothers of children in the early 
birth-order positions, pointing up the con- 
founding of birth order and maternal age in 
every study reviewed. The age of the mother, 
moreover, can be considered important 
whether birth-order effects are considered
primarily physiological or due to family in- 
teraction. 

Hospital records of 309 male and 415 fe- 
male schizophrenics showed more lastborn 
than firstborn women (p < .001). The difference
among males was not significant. The lastborn
patients seemed to have lower social
competence. Relative to firstborn, more lastborn
patients were single; their occupational
level was lower; the occupational level of the
spouses was lower; they had had fewer years
of school; they had more hallucinations, more
frequent depersonalization, and more suicidal
tendencies (Schooler, 1964). In another
study, schizophrenic women who were in the
last half of their sibling groups were less
social than those in the first half. Contrary to
Schachter’s results, though, in which firstborn
and only children did not differ, Schooler
found firstborn to be most sociable, only children
least sociable (Schooler & Scarr, 1962).


DISCUSSION 


The evidence in support of relationships
between birth order and volunteering, identification,
delinquency, alcoholism, and schizophrenia
is tenuous. It seems strongest in regard
to schizophrenia but even here it is
[...]. The confounding with birth-order
effects of the effects of the mother’s age and
increased proportions of one-parent families
among later born further clouds the picture.
After considerable attention to birth order
in its relationship with social behavior, only 
two or three conclusions have found substan- 
tial support. Overwhelming evidence indicates 
that firstborn of both sexes attend college in 
relatively greater numbers than later born. 
Firstborn of both sexes are more susceptible 
to social pressure and are more dependent 
than later born. Firstborn women, when ap- 
prehensive, desire the company of others more 
strongly than do later-born women. 

The greater college-going rates of firstborn
may be a response to social pressure. If so, 
the overproportion of firstborn in college can 
be expected to have increased during recent 
years and to be greater among students with- 
out college-educated parents than among 
those whose parents had entered college. 
These hypotheses rest on the assumptions that 
social pressure toward college attendance has 
increased in recent years and that elements 
other than social pressure play a relatively 
greater role in college attendance when the 
students’ parents have attended college. One 
such element is the greater value placed on 
abstract concepts and their more frequent use 
in homes of college-educated parents than in 
other homes.

Among University of Nebraska freshman 
men in the College of Agriculture, the over- 
proportion of firstborn relative to later-born 
sons holds only in families in which the father 
had not entered college (Warren, 1963). In a
sample of freshman men representative of the 
whole university, however, an overproportion 
of firstborn appears for all levels of father’s 
education (Warren, 1964). Data on trends in 
the overproportion of firstborn in college have 
not been examined, although the Columbia 
College data reported by Schachter (1963) 
would show such trends if they exist. 

Unexamined except in a cursory way in this 
paper are the prenatal and perinatal corre- 
lates of birth order. Weller (1965), in a care- 
ful study of 40 white infants between 2 and 
5 days old, found the infants with siblings 
exhibiting higher levels of arousal than first- 
born infants. He suggests that such a basic 
variable as arousal level, because of its asso- 
ciation with generalized drive, alertness, and 
reactivity to stimuli, should have profound
effects on the shaping of behavior patterns.
Waldrop (1965), however, found that newborn
infants in large families with closely
spaced siblings were more lethargic than those 
in smaller, less dense families, a finding mildly 
at variance with Weller’s. 

More careful study of birth order than has
been demonstrated in the studies reviewed, 
as Sears suggested 15 years ago, seems desir- 
able. Correlates of birth order at all age 
levels, from before birth well into adulthood, 
should be studied for an adequate
understanding of birth order.

The definition of birth order should not
be haphazard; only children and firstborn
with siblings should be differentiated. Some
measure such as the Family Size and Density
Index (Waldrop, 1965), in which account is 
taken of the spacing of children in a family, 
would improve the specificity of birth order 
as a concept. 

Adaptation-level theory (Helson, 1964) 
may be useful in integrating the neonatal and 
adult correlates of birth order as well as its 
physiological and sociological correlates. If 
adaptation level rises with later birth posi- 
tions, the greater social responsiveness of 
firstborn college students and the greater leth- 
argy of later-born infants in large families 
would both be 


however, would suggest a higher adaptation 
level among firstborn children. Birth-order

stimulus used by S during learning (the
functional stimulus) may be quite different
from the stimulus E presents (the nominal
stimulus). This distinction should be kept
in mind for it will help to understand the
manner in which several variables affect B-A
[...]. Thus, if S learns the A-B pair
by use of a functional stimulus widely [...]
from the nominal stimulus, poor
[...] on the B-A test can be expected
since S must supply the nominal stimulus as
a response.


PROBLEMS OF MEASUREMENT


Before undertaking a critical survey of
studies dealing with the effect of independent
variables on B-A associations, consideration
should be given to certain problems of measurement
which arise. It is, perhaps, necessary to
point out that these problems arise only when
a certain analytical approach is accepted, but
since this approach mirrors much current
thinking in verbal learning, it will be used here.
It has been fruitful to conceive of forward PA
learning as consisting of two stages, response
learning and associative learning (Underwood
& Schulz, 1960). During the first stage,
the task is simply to learn what the responses
are; during the second stage, S must associate
each response with the proper stimulus. The
understanding of B-A learning may also
benefit from a similar distinction. Thus, in
order to perform successfully on the B-A
test, S will have to complete two learning
stages. [...] Secondly, B-A associative
learning will be required so that each stimulus
becomes associated with its appropriate response.
It is to be expected that variables
will have differential effects upon the two
stages of backward learning.
With this background, it may now be
seen where the measurement problems arise.
Several studies indicate that B-A performance
increases with the degree of forward learning.
Jantz and Underwood (1958) gave 4, 8, 12,
[...] A-B trials on a PA list consisting of 8
nonsense-syllable adjective pairs. This was
followed by 10 trials of B-A learning. Results
revealed that both A-B and B-A learning
[...] to be functions of the number of A-B

A-B learning to criterion (i.e., give different
numbers of A-B trials to equate for A-B 
strength) without producing differences in 
stimulus learning because the stimuli are 
maximally available for all groups. Therefore, 
any resultant difference between groups on 
the B-A trials can be attributed to differences 
in the associative stage of backward learning. 
It should be pointed out that when the 
variable produces a large effect on A-B learn- 
ing, the use of an A-B criterion will not be 
entirely satisfactory as a control for degree 
of A-B learning. When rates of approach to 
a criterion are substantially different, the 
final degree of learning will also differ; the 
group reaching criterion first will have a 
higher degree of learning (Underwood, 1964). 
When this situation obtains, it is necessary 
to determine A-B strength separately for the 
different levels of the variable either by em- 
ploying control groups which are given one 
additional A-B test trial, or by the use of 
probability analyses designed to predict im- 
mediate A-B strength. The strength of the 
B-A association can then be determined by 
loss scores (the difference between immediate 
A-B performance and B-A test trial performance)
or percent loss scores. The situation
is exactly the same as exists in the study of 
long-term retention where degree of learning 
must be controlled; therefore, the reader 
should consult Underwood (1964) for a com- 
plete discussion of the techniques available 
for control. 
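
The loss-score computation described above can be illustrated with a minimal sketch. The function names, the example scores, and the definition of percent loss as loss divided by immediate A-B performance are assumptions made for illustration; they are not taken from Underwood (1964).

```python
def loss_score(immediate_ab, ba_test):
    """Loss score: immediate A-B performance minus B-A test-trial performance."""
    return immediate_ab - ba_test


def percent_loss(immediate_ab, ba_test):
    """Loss as a percentage of immediate A-B performance (assumed definition)."""
    if immediate_ab <= 0:
        raise ValueError("immediate A-B performance must be positive")
    return 100.0 * loss_score(immediate_ab, ba_test) / immediate_ab


# Hypothetical mean correct responses (out of 8 pairs) at two levels of a variable.
# Each entry: (immediate A-B performance, B-A test-trial performance).
HYPOTHETICAL_GROUPS = {
    "low level": (6.0, 3.0),
    "high level": (8.0, 5.0),
}

if __name__ == "__main__":
    for label, (ab, ba) in HYPOTHETICAL_GROUPS.items():
        print(label, loss_score(ab, ba), percent_loss(ab, ba))
```

With these made-up values the two groups show the same loss score (3 items) but different percent loss (50% versus 37.5%), so the choice between the two measures matters whenever immediate A-B strength differs across levels of the variable.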
A more complex solution is required when 
the nature of the variable (e.g., stimulus term 
M) is such that number stimuli cannot be 
used. Under these circumstances, the admin- 
istration of different numbers of A-B trials 
may result in differential availability of the 
stimulus terms. Thus, one might conclude
that stimulus term M was an effective variable
in B-A performance, when, in fact, it
was the different number of A-B trials that
produced the effect, and not the M differences
directly. Of course, the different number
of trials seemed mandatory in order
to equate for degree of A-B learning. No
simple solution seems satisfactory. One cannot
use a recognition test of B-A learning,
as Leicht and Kausler (1965) have attempted,
unless a paced recognition procedure
is worked out which will prevent S from
running through each A term and testing it
against the B term via the forward association.
If such running through is possible by
the nature of the test, it will not be clear
that the test is in fact measuring backward
associations. Prior familiarization of the
stimuli in the lists to make them equally
available, as was attempted by Asch and
Ebenholtz (1962), is also inappropriate due
to the potential interference effects of familiarization
which may change as a function
of the variable in question (Horowitz &
Larsen, 1963).

Weiss (1965) has offered a rather complex
solution, but it appears to be the most adequate
one available at present. Two groups
are given A-B learning to criterion. One
group is then transferred to B-A learning and
the other group is transferred to B-A re-paired
(B-Ar) learning. In the B-Ar condition,
the stimuli and responses of the A-B
list are reversed, but, in addition, they are
re-paired. It is known (Murdock, 1956) that
this paradigm produces negative transfer,
presumably due to the interference from the
backward associations developed during A-B
learning. Since both the B-A and the B-Ar
paradigms are employing the same responses,
differences in availability are factored out.
What is left is a measure of the strength of
the backward association—the stronger the
association is, the greater should be the positive
transfer in the B-A paradigm, and probably
the greater the negative transfer in the
B-Ar paradigm. Therefore, the difference between
B-A and B-Ar transfer is a measure
of backward associative strength.
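
The difference measure can be written out as a minimal sketch. The function name and the example values are hypothetical, and treating "transfer" as a single mean score on the transfer task for each group is an assumption made for illustration.

```python
def backward_strength_index(ba_transfer, bar_transfer):
    """Difference between B-A and B-Ar transfer-task performance.

    Both groups use the same response terms, so availability is taken to be
    equal and the difference is read as backward associative strength.
    """
    return ba_transfer - bar_transfer


# Hypothetical mean correct responses on the first transfer trial.
HYPOTHETICAL_BA = 7.0    # group transferred to B-A (positive transfer expected)
HYPOTHETICAL_BAR = 2.5   # group transferred to B-Ar (negative transfer expected)

if __name__ == "__main__":
    print(backward_strength_index(HYPOTHETICAL_BA, HYPOTHETICAL_BAR))
```

A larger index corresponds to a stronger backward association under the assumptions stated in the text; an index near zero would suggest that little backward association was formed during A-B learning.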

When the effect of a stimulus variable on
the backward association is to be determined,
a B-A and B-Ar group would have to be run
at each level of the variable. If the variable
has an effect on the backward association,
then there should be an interaction between
the levels of the variable and the type of test
(B-A or B-Ar). With this procedure, different
numbers of trials on forward learning
could be given in order to equate for degree
of A-B learning. Any resultant differences in
availability of the A terms would be handled
by the fact that both the B-A and B-Ar
groups within a given level of the variable
would be operating with equal availability.
The complexity of this procedure and the 
nature of the necessary assumptions about 
learning and transfer indicate that it should 
be applied with caution when the effect of a 
stimulus-term variable is in question. 
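
The expected interaction can be illustrated descriptively. The sketch below computes the B-A minus B-Ar difference at each level of a variable and then the contrast between two levels; the function names and all scores are hypothetical, and an actual study would test the interaction with an analysis of variance rather than inspect a single contrast.

```python
def strength_by_level(scores):
    """Map each level of the variable to its B-A minus B-Ar difference.

    scores: dict of level -> (mean B-A transfer score, mean B-Ar transfer score)
    """
    return {level: ba - bar for level, (ba, bar) in scores.items()}


def interaction_contrast(scores, level_a, level_b):
    """Difference between two levels in the B-A minus B-Ar index.

    A nonzero value is the descriptive form of a level-by-test-type interaction.
    """
    index = strength_by_level(scores)
    return index[level_a] - index[level_b]


# Hypothetical mean transfer scores: level -> (B-A group, B-Ar group).
HYPOTHETICAL_SCORES = {
    "low M": (6.0, 3.0),
    "high M": (7.5, 2.5),
}

if __name__ == "__main__":
    print(strength_by_level(HYPOTHETICAL_SCORES))
    print(interaction_contrast(HYPOTHETICAL_SCORES, "high M", "low M"))
```

Because both groups within a level share the same availability of the A terms, the within-level difference is not affected by giving different numbers of forward trials at different levels.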


THE VARIABLES IN B-A LEARNING 


The main implication of the observed cor- 
relation between A-B and B-A strength lies 
in its importance for studying the effects of 
other variables on B-A learning. If A-B 
strength is not equal for the different levels 
of the variable or if A-term availability is 
not equal, it will be impossible to draw conclusions
about the effect of the variable on the B-A
association. A review of the literature reveals
that virtually all of the published studies on 
B-A learning as a function of some vari- 
able have problems either in equating A-B 
strength, A-term availability, or both. The 
studies are summarized in Table 1 where the 
design problems are indicated for each study. 
Column 1 indicates the variable in question 
and the reference; Column 2 indicates whether
or not there is a problem with respect to 
stimulus availability, and Column 3 indicates 
whether or not there is a problem with respect 
to degree of A-B strength. 

It is apparent from Table 1 that virtually 
all of the studies cited are confounded by 
either differences in stimulus availability, or 
differences in degree of A-B learning, or both. 
There are only four comparisons that have 
received an unconfounded test with respect 
to their effect upon B-A learning. Hunt 
(1959) has shown that increasing stimulus M 
increases performance on a backward test
and that stimulus-term pronunciation has no
effect on backward performance. Hakes
(1965), however, has found that stimulus
pronunciation increases backward performance
by a small amount. Cassem and Kausler
(1962) have shown that backward performance
at a 4:4-second rate is superior to performance
at a 2:2-second rate. It should be
clear that future research on backward
learning should focus on making clean examinations
of the effects of variables, with priority
being given to those variables known
to influence forward learning.


TABLE 1

REVIEW OF STUDIES OF BACKWARD LEARNING

                                                     Problems of
Variables and experiments                 Stimulus        A-B (forward)
                                          availability    degree of learning

Number of A-B presentations
  Jantz and Underwood (1958)              yes             does not apply
  Leicht and Kausler (1965)               yes             does not apply

Meaningfulness (M)
  Cassem and Kausler (1962)               no              yes
  Epstein (1962)                          no              yes
  Hunt (1959)
    stimulus M                            no              no
    response M                            no              yes
  Jantz and Underwood (1958)              no              yes
  Leicht and Kausler (1965)               no              yes
  Morikawa (1959)                         no              yes
  Newman and Gray (1965)                  no              yes
  Richardson (1960)                       no              yes

Similarity
  Feldman and Underwood (1957)
    stimulus similarity                   yes             yes
    response similarity                   yes             yes
  Morikawa (1959)                         yes             yes
  Newman and Buckhout (1962)              no              yes

Percentage occurrences of response numbers
  Dron and Boe (1964)                     no              yes
  Kausler, McLaughlin, and Kulik (1962)   yes             no

Stimulus-term pronunciation
  Hunt (1959)                             no              no
  Hakes (1965)                            no              no

Presentation rate
  Cassem and Kausler (1962)
    Forward rate                          no              yes
    Backward rate                         no              no


It should also be pointed out that the 
studies cited in Table 1 which used a con- 
stant number of trials during forward learn- 
ing are listed as being unconfounded by 
stimulus-availability problems. This does not 
mean that such a procedure will allow for 
conclusions about the effect of the variable on 
the backward associative stage. A “no” entry 
under these circumstances means only that 
the procedure has not produced availability 
differences due to different numbers of trials 
being administered. This “no” should not be 
construed to mean that the variable itself has 
not produced availability differences.
For example, the Hunt (1959) experiment
on stimulus M used a criterion during A-B 
learning, but since stimulus M had no 
effect on A-B learning rate, it is listed in 
the table as unconfounded. The results indi- 
cate that increasing stimulus M produced in- 
creased B-A performance. The point is that 
one cannot tell if the effect was in the stimu- 
lus-learning stage, the B-A associative stage, 
or both stages. A stimulus-learning effect 
would be expected since it is known that M is 
directly related to availability (Underwood & 
Schulz, 1960); therefore, the difference ob- 
served by Hunt may be due to increasing M 
producing increased availability of the re- 
sponses of the backward test. It is impossible 
to conclude that stimulus M had any effect on 
backward associative learning. It has already 
been pointed out that if one wishes to deter- 
mine the effect of a stimulus variable on 
backward associative learning, a procedure 
along the lines of the Weiss (1965) technique 
is required. 

Finally, most of the studies cited in Table 
1 which have used a criterion during A-B 
learning may nevertheless be confounded by 
differences in A-B strength. This is because, 
as pointed out earlier, the degree of learning 
will not be equal if the rate of approach to 
criterion is different for different groups. Con- 
trol groups or probability analyses are required 
to assess immediate strength. The studies of 
Hunt (1959) and Epstein (1962) are con- 
founded with respect to A-B strength in a 
more subtle manner. Hunt’s procedure in- 
volved the dropping out of items as they were 
learned to a two-successive-correct criterion. 
Overall learning was carried to this criterion 
for all items in the list; then the B-A test was 
administered. In Epstein’s experiment, as soon
as an item was anticipated correctly once, it 
was “turned over” and presented for B-A 
testing on the next trial while other items 
were still being anticipated in the forward 
direction. It seems likely that one or two 
correct responses for an item of high M will 
result in a greater probability of being cor- 
rect given one more A-B trial than the same 
number of reinforcements for a low M item 
(Underwood, 1964). Thus, degree of forward 
learning was different for the different levels 
of M. 


One other problem of method deserves
mention here. It is conceivable that the difficulty
of the associative learning may be different
for the two pairing directions, even if
both were learned in the same direction. Given
two groups both learning in the forward direction,
one learning A-B and one learning B-A,
it is possible that the two associative stages
differ in their intrinsic difficulty. For this
reason, when A-B forward learning is compared
to B-A backward learning, part of the
difference may be due to pairing difficulty
independent of the direction of learning. B-A
performance on a backward test ought to be
compared to a control group that learned
B-A in the forward direction. This procedure,
suggested by Richardson (1960), will control
for difficulty of pairing, so that any differences
between the groups can be attributed to
direction of learning. Most of the studies cited
in Table 1 have failed to control for this
possibility.

If A-B and B-A can differ in intrinsic difficulty,
a rather unusual finding is made possible.
Suppose that the B-A pairing is easier
than the A-B pairing, but that Ss are learning
A-B in the forward direction. Under these
circumstances, one might find that backward
performance on the easier B-A pairing would
be greater than forward performance. Also, if
the two associative stages (backward and
forward) can be carried out independently,
certain Ss under certain conditions might
learn the backward association before the
forward, a fact that might also be revealed in
greater backward performance.

In summary, it may be stated that there is
a clear need for a systematic series of experiments,
properly designed, to determine the
influence of different variables on backward
learning. It would also appear advisable to
separate the effects of any variable on the
stimulus-learning stage from effects on the
backward associative stage. The stimulus-learning
effects can most easily be gotten by
measuring free recall of the stimulus terms
following A-B learning (e.g., Newman, Cunningham,
& Gray, 1965); whereas, the study
of backward associative effects will require
the use of B-A paced anticipation tests with
highly available stimuli or with some procedure
analogous to that of Weiss (1965).


ASSOCIATIVE SYMMETRY


In 1962, Asch and Ebenholtz proposed the 
principle of associative symmetry. “When an 
association is formed between two distinct 
terms, a and b, it is established simultaneously 
and with equal strength between b and a [p. 
136].” This principle clearly implies only one 
associative process which results in equal 
strength in both directions. Thus, these writers 
contend that the associative stage of forward 
learning and the associative stage of backward 
learning are identical. 

The statement of the associative symmetry 
principle represented a disagreement with the 
ongoing thinking about backward associa- 
tions. It was, apparently, being tacitly as- 
sumed by most researchers that B-A learning 
was something separate from A-B learning, 
something incidental to A-B learning, and of 
considerably less strength. The data clearly 
show that backward performance is almost 
always poorer than forward performance. 
There have been exceptions (Guthrie, 1933; 
Hermans, 1936; Wohlgemuth, 1913). Stod- 
dard (1929) and Asch and Lindner (1963) 
have even found B-A performance superior 
to A-B. But the most common finding is that 
forward performance surpasses backward
performance.

This apparent asymmetry between A-B and 
B-A performance is explained by Asch and 
Ebenholtz (1962) on the basis of availability 
of the stimulus terms. Their interpretation is 
that since S does not have to learn the nomi- 
nal stimuli of the PA list during forward 
learning, the stimulus-learning stage of B-A 
learning is not carried out. The nominal 
stimuli are not “available” to S at the time of 
B-A recall. But despite this, Asch and Eben- 
holtz feel that the associative stage of B-A 
learning has taken place since A-B associative 
learning is complete. They interpret the usual 
finding of superior A-B to B-A performance 
as being due entirely to a difference in “response”
learning with no difference in associative
learning. Thus, the completed backward
association leads to a response which is
not available (an association leading to “nothing”)
or more likely it leads to the functional
stimulus. Since S may select a functional
stimulus that differs from the nominal stimulus,
the B-A association for him would be
between the response and the functional stim- 
ulus. Scoring as correct only recall of the 
nominal stimulus (as is usually done) is 
“loading the dice” against B-A recall. 

One might also produce greater backward 
than forward performance by loading the dice 
against forward recall. Suppose the A-B pairs 
consisted of single-digit numbers as stimuli 
and difficult trigrams as responses. If the B-A 
test were administered before S had a chance 
to complete the response learning of the tri- 
grams, superior performance would result in 
the backward direction (trigram-number) 
than in the forward direction (number-tri- 
gram). Asch and Lindner (1963) have shown 
just this result. 

Availability of the Stimulus. The conten- 
tion of Asch and Ebenholtz that the nominal 
stimuli are less available than the responses 
in most PA experiments is well supported by 
data. Asch and Ebenholtz (1962, Experiment 
I) gave Ss PA learning of eight paired CVCs 
to a criterion of 4/8 or 8/8 correct responses. 
Then a 2-minute free-recall test was admin- 
istered where S had only to write down as 
many stimuli and responses as he could re- 
member. The results showed that significantly 
fewer stimuli were recalled than responses. 
Similar results have been found by Newman 
and Gray (1964); Newman et al. (1965); 
Battig, Brown, and Nelson (1963); and by 
Morikawa (1959). Whenever conditions are 
suitable for stimulus selection (e.g., low M, 
low similarity), we may expect that recall of 
the nominal stimuli will be inferior to recall 
of the responses; therefore, a fair test of the 
symmetry position is impossible. The solution 
to this problem is the use of highly available 
stimulus and response terms (e.g., single-digit
numbers and single letters) which will ensure 
equal stimulus and response availability, a 
condition mandatory for a test of symmetry.

Attempts to Equate Availability of Stimuli 
and Responses. Asch and Ebenholtz (1962, 
Experiments V, VI, and VII) attempted to 
equate stimulus and response availability by 
giving free learning of all the units (stimuli 
and responses) prior to PA learning. It should 
be pointed out that this is a potentially 
troublesome procedure since the free learning 
might produce interitem connections which
could subsequently interfere with the PA
learning (Horowitz & Larsen, 1963). Such 
interference could modify the nature of the 
learning processes, and to generalize from 
these situations to other PA situations not 
involving these interference effects would not 
seem appropriate. Despite the free learning of 
all list items, Experiments VI and VII still 
showed a superiority of A-B to B-A recall 
(possibly produced by the interference ef- 
fects). Experiment V, however, showed no 
difference between A-B and B-A recall 
thereby supporting the symmetry principle. 
The conditions of Experiment V consisted of 
PA learning by a free-learning-of-pairs tech- 
nique. During forward learning, the S had to 
produce both the stimulus and the response 
of each pair. The procedure was alternate 
study and test with S given 1 minute for free 
recall on test trials. Under these conditions, 
forward and backward recall were equal. 
Under standard PA anticipation, without the 
prior free learning of the units, they found the 
usual superiority of A-B performance. 

The failure of prefamiliarization via free 
learning to eliminate the difference between 
A-B and B-A performance in Experiments VI 
and VII of the Asch and Ebenholtz mono- 
graph was attributed by the authors to a 
reestablishment of greater availability of re- 
sponse terms than stimulus terms under PA 
anticipation learning. This would imply that 
PA learning produces a decrease in the avail- 
ability of the nominal stimuli. If Asch and 
Ebenholtz are correct, an experiment which 
familiarized the stimuli and responses by free 
learning and followed this with PA learning 
and a test of availability should find that the 
responses are more available than the stimuli 
and that the nominal stimuli are less available 
than they were at the end of the free learning. 

On the whole, then, the Asch-Ebenholtz 
experiments are rather unconvincing in that 
only one experiment showed no difference between
A-B and B-A recall despite the prefamiliarization.
Of course, the prefamiliarization
itself may be responsible for these results
due to possible interference effects. Even
the experiment that displayed symmetry was 
not done with standard PA procedures; there- 
fore, their results offer rather weak evidence 
for symmetry as a rule for PA learning. 


The strongest evidence for symmetry comes
from other sources. Houston (1964a) has
shown that B-A recall equals A-B recall when
the functional stimulus is the response of the
B-A test. Using compound stimuli (colors and
CVCs) and single-digit numbers as responses,
he found that B-A recall to the colors (presumably
the functional stimuli) was equal to
A-B recall to the digits. B-A recall to the
nominal stimulus (the color-CVC compound)
exhibited marked asymmetry, presumably
because the CVCs were not available as responses
for the B-A test.

Richardson (1960) compared forward and
backward recall of a list consisting of single-digit
numbers and three-letter words. Following
10 trials of forward learning, it would seem
fair to conclude that the words were highly
available, and, of course, the numbers would
be highly available also. Backward performance
(word-number) was equal to forward
performance (number-word), supporting symmetry.
If, however, the backward performance
on the word-number list is compared to Trial
10 performance of another group which
learned the word-number list in the forward
direction, then backward performance was
95.3% of the forward performance. This figure
is probably slightly large since forward
performance should have been measured by
giving one additional forward trial (Trial 11)
or by the use of probability analyses. We do
not know how the forward group learning
word-number pairs would have performed
had they been given a Trial 11; since backward
testing was equivalent to an 11th
trial, a perfect comparison is not possible.
Nevertheless, the Richardson results, if not
supporting symmetry, at least appear to show
that forward-backward differences are quite
small.

Another method for equating availability
of stimuli and responses is to have some or
all of the stimuli serve as responses in other
pairs in the list. Thus, one pair might be A-B
and a second C-A, a situation insuring
availability of both A and B. If symmetry is
the rule, the A-B pair should be recalled
equally in the forward and backward direction.

Horowitz, Brown, and Weissbluth (1964)
had Ss learn a PA list composed of six pairs,
two of which were designed to be symmetrical
pairs via the above procedure. Following 
learning, each S gave two free associations to 
each of the units in the list. The results 
showed an equal number of forward and back- 
ward associations for the critical pairs and 
were interpreted as favoring symmetry. There 
was, however, a tendency for the forward 
association to be given before the backward, 
a situation which may indicate slight asym- 
metry. 

Umemoto and Hilgard (1961) and Young 
and Jennings (1964) have investigated B-A 
recall following the learning of double-func- 
tion lists (each unit serves once as a stimulus 
and once as a response, in two different pairs). 
Both experiments showed considerable loss 
from the last trial of forward learning to the 
first trial of backward learning which might 
indicate that the forward association was 
stronger than the backward. However, the 
competition between the two associations may 
have been responsible for this decrement. 
Such competition was not present in the 
Horowitz et al. (1964) experiment since S 
was allowed to give two responses. 

Battig and Koppenaal (1965) attempted to 
eliminate competition effects in their experi- 
ment using a double-function list (to insure 
availability of all units) by using a free- 
recall test of either the forward or backward 
association. They found that recall was significantly
poorer in the backward direction,
despite the fact that all units were available.
Assuming that double-function learning insures
equal availability, this experiment at
least demonstrates that symmetry is not always
the rule.

The above experiments on double-function 
learning, coupled with those of Primoff (1938) 
and Young (1959), indicate that Ss can indeed
learn such a list. This in itself seems to
require modification of the symmetry position
which in its present form would have to predict
that such learning would be impossible.

If an association must always be equal in
both directions, then the learning of A-B and
B-C pairs in the same list would probably be
impossible. This may, however, be only a
minor problem for symmetry theorists; B as
a stimulus (in the left-hand position) may
have additional cue value which will lead to
the response C instead of to A (via the backward
association). Moreover, it is a fact that
double-function learning, while not impossible,
does proceed quite slowly (Primoff, 1938;
Young, 1959).

On the other hand, the finding of superior 
forward recall after double-function learning 
(Battig & Koppenaal, 1965) does not neces- 
sarily refute symmetry for the case where 
there are no double-function pairs. It may be 
that extinction of the backward association 
takes place during double-function learning 
(being necessary to learn the forward associa- 
tion) and that this is why backward recall is 
inferior. Such an assumption, of course, im- 
plies that there are two associative processes 
that can be carried out separately. With 
normal PA learning, where extinction of the 
backward association would not be necessary 
in order to perform the forward association, 
the two may or may not develop to equal 
strength. 

Underwood and Keppel (1963) and Voss 
(1965) have investigated PA learning where 
Ss were required to learn bidirectionally (i.e., 
the Ss had to anticipate B given A on half 
the trials, and A given B on the other half). 
If symmetry is the rule, then bidirectional 
associative learning should take no longer than 
unidirectional learning. Underwood and Kep- 
pel found no difference in trials to a one 
perfect criterion between a bidirectional group 
and a group tested always on A-B or al- 
ways on B-A. Internal analyses indicated 
that the probability of a correct response 
after the first correct response was significantly
lower in the bidirectional group, thus
supporting asymmetry. They interpreted bidirectional
learning as simply a summation of
the learning of two separate associations, A-B 
and B-A. 

Voss (1965) also found evidence that B-A 
associative learning was not complete, given 
that A-B associative learning was complete. 
However, his measure of availability (the 
point at which a response is first given as a 
response anywhere in the list) is invalid 
(Ekstrand, in press). 

The Underwood and Keppel (1963) and the 
Voss (1965) experiments seem to indicate 
that something more is required in bidirec- 
tional learning than in unidirectional learning, 
but this may or may not be additional associative
learning. An alternative might be
that bidirectional learning simply involves 
twice as much response learning (both A and 
B must be learned) with equivalent associa- 
tive learning. If S can anticipate B given A, 
but fails to anticipate A given B, it may be 
because the A term is not yet fully available, 
and not because the B-A association is weaker 
than the A-B. This again leads to the unusual 
possibility that the B-A association may be 
leading to “nothing” (i.e., it is there and 
equal in strength to A-B, but it leads to an 
unavailable A term). 

In two experiments, Murdock (1962, 1965) 
has shown associative symmetry with pro- 
cedures substantially different from those 
under consideration in the present review. In 
the 1962 experiment, dealing with short-term 
memory of PA items, Murdock instructed his 
Ss that recall would be in the backward di- 
rection for some of the items and in the for- 
ward direction for the others. In the 1965 ex- 
periment, employing dichotic presentation of 
the two units within a pair, Murdock ap- 
parently also instructed Ss that recall of 
either unit, given the other unit as a cue, was 
to be tested. Both experiments showed asso- 
ciative symmetry. While these experiments do 
not show symmetry for standard PA learning 
when Ss are uninstructed as to the B-A test, 
they might, nevertheless, be used in support 
of the symmetry position. 

Transfer Experiments. If forward and back- 
ward associations are equal in strength, then 
transfer (either positive or negative) should 
be equivalent for situations requiring the 
use of previous forward or backward associa- 
tions. 

Murdock (1956) found that transferring to
B-A was inferior to continuing with A-B, but
the difference was not significant. Both
groups showed significant positive transfer
when the learning of entirely new items was
the control. In another set of conditions designed
to produce negative transfer, Murdock
(1956) compared A-B, A-Br (S-R re-paired)
transfer and A-B, B-Ar (R-S re-paired) trans- 
fer with control learning of new items. Both 
re-paired conditions showed negative transfer, 
but the two did not differ. These results are 
in contrast to those found by Asch and Ebenholtz
(1962, Experiment VII). They found
that transfer from A-B to B-A was significantly
less than “transfer” from A-B to A-B.
This was true despite the fact that all items 
had been prefamiliarized. 

Murdock (1958) compared two paradigms 
designed to yield negative transfer from 
backward associations (A-B, B-C; A-B, C-A; 
see also Houston, 1964c) with the standard 
A-B, A-C paradigm which yields negative 
transfer from forward associations. All three 
paradigms produced negative transfer with 
respect to a C-D control, but the backward 
paradigms did not produce less negative trans- 
fer than the forward paradigm. Harcum 
(1953) found similar results although the 
negative transfer in his A-B, C-A condition 
was not significant. 

Using three-stage paradigms, Horton and 
Kjeldergaard (1961) found no significant 
differences in amount of positive transfer be- 
tween paradigms dependent upon forward as- 
sociations and those dependent upon backward 
associations for transfer. Horton and Hart- 
man (1963), however, did find significantly 
more positive transfer in a forward three-stage
paradigm than in a backward paradigm.
Peterson, Colavita, Sheahan, and Blattner 
(1964) also found forward transfer greater 
than backward transfer when consonant tri- 
grams were used as learning materials. With 
100% association value nonsense syllables, 
they found no difference. 

The above series of experiments indicates 
that the transfer value of forward and back- 
ward associations may not always be different. 
However, the use of mixed-list paradigms in 
some cases, and the highly complex processes 
that are involved in multistage designs do 
not make conditions ideal for testing theo- 
retical notions. To infer symmetry or asym- 
metry on the basis of these paradigms would 
be premature. 

Extinction of Associations. Barnes and Underwood
(1959) have shown that when the
stimuli and responses of two associations conform
to an A-B, A-C (hereafter A-C) paradigm,
the learning of the A-C association will
produce extinction of the A-B association. If
an A-C paradigm is used, it can be seen that
it is only the forward association, A-B, which
is subject to extinction. But what happens to
the B-A association? If the forward and backward
associative processes are identical, then
the B-A association must also be lost. A simi- 
lar situation is present if the A-B, C-B (here- 
after C-B) paradigm is used, only now it is 
the backward associations, B-A and B-C, 
which conform to the Barnes and Underwood 
extinction paradigm. Again, if the associative 
processes of forward and backward learning 
are identical, extinction of the backward as- 
sociation should unavoidably affect the for- 
ward association as well. 

Keppel and Underwood (1962) reported 
that extinction of a forward association did 
not affect the backward association. Their Ss 
learned two lists conforming to the A-C para- 
digm. Materials were CVC-adjective pairs. 
Learning of the first list was carried to a one 
perfect criterion; learning of the second list 
was for 15 trials. A control group learned 
two lists in an A-B, C-D (hereafter C-D) 
paradigm. Following second-list learning, the 
Ss were given a list of first-list responses and 
were required to produce the appropriate 
stimuli. The A-C group did not show any loss
of the backward association, despite the fact 
that the forward associations had undergone 
extinction. Such results argue against the 
identity of forward and backward associative 
processes; however, they are not conclusive
because the A-C paradigm necessarily pro- 
duces greater availability of the A terms since 
they appear in both lists. Therefore, the lack 
of retroactive inhibition (RI) of the back- 
ward association in this paradigm may only 
be due to better guessing potential. That such 
was the case has recently been demonstrated 
by Houston (1964b). Using highly available 
stimuli, he did demonstrate unlearning of the 
backward association in the A-C paradigm. 

There are two studies which show that ex- 
tinction of the backward association leads to 
a reduction in strength of the forward associa- 
tion. McGovern (1964) compared the C-B 
and C-D paradigms on recall of the first-list 
forward associations. The C-B paradigm 
should produce extinction of the backward 
association; and, if there is only one associative
process, extinction of the forward association.
Learning of the first list was to 1 perfect
trial, that of the second list for 15 trials.
There followed a free-recall test of List-1 forward
associations in which the C-B group
showed RI. This RI of the forward associa- 
tion, produced by extinction of the backward 
association, has also been found by Green- 
bloom and Kimble (in press) using a pro- 
cedure quite similar to McGovern’s. 

The results of these experiments on extinc- 
tion do not indicate that the backward and 
forward associations can be extinguished in- 
dependently of each other. Rather, it appears 
that extinction of one is likely to produce ex- 
tinction of the other, a fact which supports the 
assumption of a single associative process. 

Learning versus Performance. Another issue 
concerns possible performance differences in 
utilizing forward and backward associations. 
Perhaps learning obeys the symmetry princi- 
ple, while performance effects produce asym- 
metry. Backward recall is usually tested fol- 
lowing a series of forward performance trials. 
The failure of B-A recall to equal A-B recall 
may be due to the fact that Ss have less prac- 
tice at recall in the reverse direction, or to a 
performance tendency to respond only in the 
forward direction. Perhaps there is some 
tendency not to give stimuli as responses; this, 
of course, would be interfering with B-A re- 
call. Rothkopf and Coke (1963) have evi- 
dence related to this interpretation. They re- 
ported that forward anticipation was better 
than backward in a sentence-learning task 
when Ss rehearsed only under one method, 
namely forward rehearsal. When rehearsal 
was in both directions for some of the sen- 
tences, they found that recall was equal in 
both directions. However, Ss under the dual- 
rehearsal method may have practiced all sen- 
tences in both directions. Their interpretation 
was that dual rehearsal broke a response set 
which was built up in unidirectional rehearsal 
conditions. The results were not conclusive; 
so it is clear that further research on possible 
performance differences between forward and 
backward associations would be valuable. 
Such differences could be responsible for a 
portion of the asymmetry that is observed in 
many experiments. 

Conclusions. The above considerations about 
the principle of associative symmetry raise as 
many questions as they answer. The concept 
of availability differences certainly is an important
one, which must be kept in mind if
one is interested in the associative phase of
B-A learning. The failure to equate for avail- 
ability of stimuli and responses has undoubt- 
edly led to a considerable overestimation of 
the difference between A-B and B-A associa- 
tive learning. On the other hand, it is not 
clear that equation of availability will en- 
tirely eliminate the difference between for- 
ward and backward learning under all cir- 
cumstances. 

Only the Houston (1964a) experiment of- 
fers clean, direct evidence in support of sym- 
metry although there are several others which 
seem to show support for the notion (e.g., 
Horowitz et al., 1964; Richardson, 1960). On 
the other side, only the Battig and Koppenaal 
(1965) experiment demonstrates clear asym- 
metry; or, in other words, there is a shortage 
of evidence against symmetry. It does ap- 
pear that the difference between forward and 
backward associative learning has been dras- 
tically overestimated, and that if symmetry is 
not the rule, asymmetry will be very small. 

It is clear that the simplest way to attack 
the problem is to compare the forward and 
backward learning of PA lists in which both 
the stimuli and responses are readily available 
units. Thus, one could compare single-digit 
numbers paired with single letters on A-B and 
B-A learning, with half the Ss learning A-B 
and B-A in the forward direction and half in 
the backward direction. If the results support 
the symmetry position, a systematic investi- 
gation of the conditions under which sym- 
metry holds would be called for, since the 
Battig-Koppenaal results indicate that sym- 
metry may not be a universal law of associa- 
tive learning. 


THE ROLE OF BACKWARD ASSOCIATIONS


In this final section, an attempt will be 
made to enumerate and discuss the implica- 
tions of backward associations for other types 
of verbal-learning tasks. Backward associa- 
tions may be important to the understanding 
of: (a) transfer and the closely related fields 
of retroactive (RI) and proactive (PI) inhi- 
bition, (b) mediation, (c) the effect of S-R 
similarity, and (d) the effect of prefamiliari- 
zation via PA learning. 

Transfer. It has been shown that backward 
associations do transfer from one list to another
(e.g., Harcum, 1953; Murdock, 1956,
1958). Consideration of such associations may 
be crucial to the understanding of the transfer 
effects of different paradigms. Houston 
(1964c) has already described a transfer sur- 
face in which backward associations probably 
play an important role. In addition, backward 
associations have been implicated in two of 
the classical transfer paradigms: C-B and 
A-Br. 

Twedt and Underwood (1959) reported 
negative transfer for the C-B paradigm using 
mixed and unmixed lists of 12 pairs of ad- 
jectives. They attributed the negative trans- 
fer to the interference from first-list B-A as- 
sociations. The B-A associations of these two 
lists conform to the A-C transfer paradigm;
thus, there should be negative transfer.

Kausler and Kanoti (1963) reported a 
replication of the Twedt and Underwood 
findings with the addition of B-A recall meas- 
ures. They found that lower C-B transfer was 
accompanied by less B-A recall of the second 
list, indicating that where interference was 
heaviest, Ss were experiencing the greatest 
difficulty with backward learning. 

The backward association is also believed 
to be important in the A-Br paradigm. Porter 
and Duncan (1953) reported negative trans- 
fer for this paradigm when the materials were 
two-syllable adjectives. They interpreted this 
as being due to interference from both the 
forward and backward associations. In this 
paradigm, both associations conform to an 
A-C paradigm; therefore, negative transfer 
should be even greater than in the C-B or
A-C paradigms. Postman (1962), Twedt and 
Underwood (1959), McGovern (1964), and 
Jung (1962) all report such findings. As 
Postman and others have suggested, A-Br 
transfer can be interpreted as a summation 
of interference from the forward and back- 
ward associations. 

Mandler and Heineman (1956) actually 
found positive transfer for the A-Br para- 
digm, but they used low M materials. This 
may have resulted in a reduction in the inter- 
ference from B-A associations as well as pro- 
ducing greater positive transfer due to re- 
sponse learning. McGovern (1964) compared 
the A-Br paradigm with the C-B and A-C 
paradigms in an RI design. Both the A-C 
and the C-B paradigms showed RI, and the
amount of RI for the two paradigms combined 
just about equaled the RI in the A-Br condi- 
tion. 

Keppel and Underwood (1962) have dem- 
onstrated RI of the backward association. 
Whenever the backward associations of two 
lists conform to the A-C paradigm we can 
expect extinction or unlearning of these as- 
sociations. The amount of RI was greatest 
for the C-B and A-Br paradigms and least for 
the C-D. There was no significant RI for the 
A-C paradigm, but Houston (1964b) has
found RI of the backward association in the 
A-C paradigm. Ellington and Kausler (1965) 
have performed a replication of the Keppel 
and Underwood findings with the C-B paradigm.
They varied the number of trials (1,
5, 10, or 20) on the second (C-B) list and 
then tested for List-1 B-A associations by 
free recall. The results showed that B-A as- 
sociations became increasingly unavailable 
with practice on the second list. A control 
condition revealed that the effect was not 
just forgetting. 

From the above experiments, it can be 
seen that the role of the backward association 
in transfer and RI experiments is considera- 
ble. The role of B-A learning in proactive 
situations remains to be demonstrated, but it 
may be anticipated that such associations 
will be found important. 

Mediation. Storms (1958) showed that 
backward association may be involved in 
mediation. His Ss first learned a list of 5 
A-C pairs and 5 A-X pairs. Then came a list 
of 10 A-B pairs. The B and C terms were
associated (as judged by association norms) 
in the B-C direction but C-B associations were 
relatively weak, although present. The results 
showed that the pairs preceded by A-C learning
were learned faster than those preceded
by A-X learning. Storms, however, interpreted 
this as “apparent” backward association. His 
feeling was that the C-B forward association, 
although weak in the norms, was primed by 
the appearance of the B terms in the second 
list (see also Cramer, 1964). 

McCormack (1961) set out to show that 
backward associations do play a mediating 
role in some paradigms. He built A-B associations
experimentally rather than inferring
them from the norms as Storms had done.
Then Ss learned a mixed list of C-B and C-D 
pairs, and finally they were tested on A-C 
pairs. The A-C pairs which had been preceded 
by C-B learning were learned faster than those 
preceded by C-D learning, thereby showing 
that backward associations can mediate new 
learning. 

The experiments of Horton and Kjelder- 
gaard (1961), Horton and Hartman (1963), 
and Peterson et al. (1964) have already been 
mentioned in an earlier section. They show 
that mediation based upon B-A association is 
possible; indeed, the effects were sometimes 
as large as when the forward association was 
the crucial mediator (see also McGehee & 
Schulz, 1961; Seidel, 1962). Since all of these 
experiments have involved quite complex de- 
signs, which undoubtedly have many factors 
participating besides B-A association, they 
should be taken only as evidence suggestive 
of the possible effects of backward associations 
in multistage designs. The reader is referred 
to Jenkins (1963) for a more detailed analy- 
sis of the complex mediational designs. 

An interesting situation arises with respect 
to backward association if the original for- 
ward learning is accomplished by the use of 
mediation. For example, if the pair is dog-9, 
the learning may be carried out through the 
chain dog-cat-9. Then, in order to perform on 
a B-A test, two backward associations are 
called for, 9-cat and cat-dog. Such situations 
should be quite sensitive to manipulations in 
B-A test rate since two associations should 
take more time to “run off.” Thus, variables 
which affect the likelihood of learning via 
mediation, and variables which influence the 
types of mediational responses made will 
probably be important determinants of back- 
ward performance. 

Stimulus-Response Similarity. When the 
similarity between the stimuli and responses 
of a PA list is manipulated, backward asso- 
ciations probably play an important role in 
the effect upon performance. Primoff (1938) 
found that a PA list in which each item served 
as a stimulus for one pair and a response for 
another pair was extremely difficult to learn. 
On the basis of a large percentage of back- 
ward errors, he inferred that backward associ- 
ations were causing the difficulty. Young 
(1961) felt that Primoff’s experiment was
just dealing with an extreme case of stimulus- 
response similarity, namely, identity. He 
varied degree of similarity and found a nega- 
tive relationship between learning and simi- 
larity. 

Underwood (1963) has pointed out that the 
Young and Primoff notions can be reduced 
to a common base since S-R similarity most 
likely has its effect via the backward associations.
The higher the similarity the greater is
the tendency for similar items to elicit each 
other’s correct responses. For example, take 
the two pairs A-B and B’-C where the prime 
indicates high similarity. When B’ is pre- 
sented, it may elicit the response A by the 
backward association between B and A. The 
backward association therefore seems to be 
the key to S-R similarity effects. 

The Young and Primoff experiments em- 
ployed formal similarity among stimulus and 
response terms. A recent experiment by 
Young and Jennings (1964) has used mean- 
ingful similarity, that is, synonyms. The re- 
sults showed some interference, but the dif- 
ference between this condition and a condition 
involving no similarity was not significant. 

Newman (1964) and Umemoto and Hil- 
gard (1961) have confirmed the inhibitory 
effect of high S-R similarity. In addition, 
Umemoto and Hilgard have shown that cer- 
tain circumstances will yield facilitation with 
higher S-R similarity. For example, a list 
with the two pairs YAV-perfect and faultless- 
YAV was learned significantly faster than a 
list where the word units were unrelated. 
Thus, B-A associations can either facilitate or
inhibit performance.

Prefamiliarization. One method of familiarizing
items to be learned in a subsequent
PA or serial list is to have Ss learn a PA list
which contains these items as response terms. 
Underwood and Schulz (1960) have used 
this procedure prior to serial (Experiment 3) 
and PA (Experiment 4) learning. In both ex- 
periments the stimuli of the familiarization 
lists were either nonsense forms or nouns. In 
both experiments the effect of familiarization 
was much less than expected, with an apparent
negative effect of stimulus familiarization
in Experiment 4. They interpreted this as
being due to the formation of B-A associations
during familiarization learning which interfered
with the serial or PA learning in the
second stage. The amount of such interference
was less with the form lists than with
the noun lists. With the forms as stimuli, B-A 
associative interference should be less since 
the forms would not be available as responses.
With nouns as stimuli, backward associative 
interference should be at a maximum. 

This interpretation has received recent con- 
firmation from an experiment by Simon and 
Wood (1964). They also showed that under 
special conditions the B-A associations developed
in familiarization learning can facilitate
subsequent PA learning as well as inhibit
it. Specifically, they familiarized stimuli
of a PA list by making them responses in an- 
other PA list. They then varied the relation- 
ship between the stimuli of the familiarization 
list and the responses of the test list, either 
associated items or nonassociated. With associated
items, they found a positive effect;
with nonassociated items, a negative effect of 
familiarization. Thus, if the test pair were
FOJ-Chair, facilitation was found if the familiarization
pair were Table-FOJ, and interference
was found if it were Music-FOJ. Backward
associations formed during PA familiarization
can either facilitate or inhibit learning
of a subsequent PA list. 

It is apparent that the concept of backward 
association has become a useful explanatory 
mechanism for understanding verbal learning. 
It has already been invoked in several situations
and we may expect that its usefulness
will increase. Furthermore, if the principle
of associative symmetry holds for many situations,
the importance of the backward association
will rival that of the forward association.
For this reason, it will be necessary to
undertake a thorough investigation of the
effects of different variables on B-A learning
and to examine all verbal-learning situations
for possible backward-learning effects. Relatively
uncharted areas with respect to backward
learning are numerous (e.g., learning to
learn B-A associations, the role of B-A associations
in forgetting and PI).

Psychotherapeutic and psychoanalytic methods have a poor record in the
treatment of sexual deviations, so that attempts to apply aversion therapy 
techniques are justified and are becoming increasingly frequent. Aversion 
therapy techniques used to date in the treatment of homosexuality and other 
sexual deviations are described and critically assessed from the point of view of 
the experimental psychology of learning. Many deficiencies are pointed out and 
suggestions for improvements are made. There is good evidence to support the 
use of instrumental rather than classical conditioning, and of electrical rather 
than chemical aversion. While results to date are encouraging, they are by no 
means conclusive. It is strongly emphasized that future progress lies in maintain- 
ing close links with general experimental psychology and with clinical psychiatry. 

Prior to the 1950s, there appears to be only
one reference (Max, 1935) to the treatment of 
sexual deviations by aversion therapy. Max 
required a homosexual patient to fantasize
the attractive sexual stimulus in conjunction
with electric shock, hence employing a classical
conditioning approach. He found it
necessary to use a shock higher than that
usual in laboratory studies, to cause a “diminution
of the emotional value of the sexual
stimulus.” This lasted for several days after 
each experimental period, and over 3 months 
the effect was cumulative. Max reports that 
4 months after the end of treatment, the 
patient said: “The terrible neurosis has lost 
the battle, not completely but 95 per cent 
of the way.” No further details are given of 
the long-term effect of this pioneering piece 
of work.

In the last few years, the great stimulus 
given to behavior therapy approaches by the 
work of Wolpe (1958), for the most part
concentrated on the treatment of disorders
such as phobias and obsessions, has also had 
an effect in the field of sexual deviations. A 
previous review of learning approaches to the 
treatment of sexual deviations has been given 
by Rachman (1961). No doubt more concerned
to make a case for the clinical usefulness
of these approaches, Rachman made little
attempt to assess them critically. The present 
paper will describe aversion therapy tech- 
niques used to date, and also offer both specific 
and general criticisms of the methods em- 
ployed. 

There are two main arguments in favor of 
applying learning-theory techniques to the 
treatment of sexual deviations. Firstly, the 
outcome of treatment by various psychothera- 
peutic techniques, despite the optimism ex- 
pressed by one or two authors, for example, 
Allen (1956) and Ellis (1956), is rather 
poor. Curran and Parr (1957) found the rate 
of improvement to be no greater in 25 of their 
cases treated by psychotherapy than in 25 
others who received little or no treatment. 
Woodward (1958) has reported a series of 
homosexual patients referred by the courts 
and treated at the London Institute for the 
Study and Treatment of Delinquency. Out of 
113 referred for treatment, data are reported 
for only 64 who either completed treatment 
or left for some good reason. Only seven 
patients had no homosexual impulse and an 
increased heterosexual interest at the con- 
clusion of their psychotherapy. Of these seven, 
all were bisexual at the onset of treatment, 
three having a Kinsey rating of 1 or 2 
(Kinsey, Pomeroy, & Martin, 1948). Attempts
made to obtain follow-up data were somewhat 
sketchy and inconclusive. With respect to 
psychoanalytic claims, Rubinstein (1958) 
was cautious: “Psycho-analysis can help to 
a certain extent and for a fair number. Some 
improve well beyond the original expectation.” 
This recalls Freud’s (1938) statement quoted 
in Jones (1964): “In a certain number of 
cases we succeed . . . in the majority of 
cases it is no longer possible . . . the result 
of our treatment cannot be predicted [p. 
624].” A large-scale psychoanalytic study has 
been reported by Bieber et al. (1963). Out of 
100 homosexual patients treated by full-scale 
psychoanalysis, 27% were solely hetero- 
sexual at the close of treatment. Those pa- 
tients who did improve had all had hetero- 
sexual experience up to intercourse at some 
stage prior to treatment. Moreover, the au- 
thors report their results only at the close 
of treatment and give no follow-up data. 

It is of interest that the major effort in 
the behavior-therapy field to date has been 
applied to such problems as phobias and 
obsessions in which the record of psycho- 
therapy by no means supports an attitude of 
therapeutic pessimism. Indeed, in a controlled,
albeit retrospective, comparison between be- 
havior therapy and psychotherapy (Marks 
& Gelder, 1965; Cooper, Gelder, & Marks, 
1965) the outcome was only slightly, if at 
all, in favor of behavior therapy. 

The second argument concerns the intrinsic 
interest of applying learning-theory principles, 
derived in the laboratory, to a field in which 
the problem is one of real-life behavior. Sexual 
behavior may be described as consisting of 
two components, an intrinsic mediational com- 
ponent and an extrinsic behavioral com- 
ponent. The possibility of directly manipulat- 
ing the latter and hence of influencing the 
former is theoretically, at any rate, quite 
evident. Clearly, most of the operant re- 
sponses involved in homosexual behavior can- 
not be reproduced in a laboratory setting and 
are not, therefore, available for manipulation. 
However, homosexual behavior can be con- 
sidered as being frequently initiated by the 
visual response of looking at an attractive 
sexual object. At least one sexual response 
is thus available for laboratory manipulation, 
and is utilized in almost all the techniques to
be described in the present paper. The great
majority of aversion therapists have used
classical conditioning, that is, the attempt is
made to associate anxiety or fear with the
previously attractive homosexual stimulus.
Only a small minority have used instrumental
conditioning, in which the avoidance or
escape from the punishing stimulus is contingent
on the performance of a specific
operant response—generally the avoidance of 
the previously attractive stimulus. Irrespective 
of the conditioning technique used, the under- 
lying aim (although this is frequently not 
specifically stated) is to suppress the visual 
response of looking—in reality or fantasy—at 
an attractive but inappropriate sexual stim- 
ulus. It is hoped that this effect will generalize 
over the whole range of homosexual responses. 
In the case of other deviations such as trans- 
vestism and fetishism, responses such as 
looking at or wearing the fetish objects are 
readily available for laboratory manipulation. 
Once again, in the treatment of these devia- 
tions, most workers have used a classical con- 
ditioning technique. 


Aversion Therapy Techniques Applied to 
Sexual Deviations 


A list of reports on the application of 
aversion therapy to homosexuality is given 
in Table 1A, and a list of those concerning 
other sexual deviations appears in Table 1B. 
Each of these is summarized in Tables 1A 
and 1B, together with any criticisms which are 
specific to the particular report. Criticisms ap- 
plicable to several reports, together with sug- 
gestions for improvement, appear later in the 
paper. 

Homosexuality. The report of Max (1935) 
has been described earlier, and in view of 
its very brief nature, no further discussion 
will be given. An important contribution has 
been made by Freund (1960). He adminis- 
tered his patients a mixture of caffeine and 
apomorphine in a number of treatment sessions
never exceeding 24. When the emetic
mixture became effective, slides of dressed 
and undressed men were shown to the patient. 
During a second phase of treatment, the pa- 
tient was shown films of nude or semi-nude 
women 7 hours after he had been administered
testosterone propionate. Sixty-seven patients
were reported on in the paper; treatment was
refused to none. Out of 20 court referrals,
only 3 achieved any kind of heterosexual
adaptation, and in no case did this last for 
more than a few weeks. The first follow-up 
was carried out after 3 years. Out of the 47
patients who presented other than due to 
court referral, 12 had shown some long- 
term heterosexual adaptation. A second fol- 
low-up 2 years later traced the histories of 
these 12. At that time none of them could 
claim complete absence of homosexual desires,
and only six could claim complete absence 
of homosexual behavior. Three of the group 
were, in fact, practicing homosexuality fairly 
frequently. Ten of them had heterosexual 
intercourse at least once every two weeks, 
but only three found females other than 
their wives sexually desirable. Moreover, 
even before treatment, two patients had become
adapted to heterosexual intercourse.
Clearly these results do not encourage an 
attitude of optimism either to the use of 
chemical aversion or to a classical conditioning 
approach. Freund’s series is, however, the 
only one in the field which includes a satis- 
factorily long follow-up and in this respect, 
therefore, is a model of its kind. 

James (1962) used apomorphine in the 
treatment of a 40-year-old homosexual with a 
Kinsey rating of 6. The treatment was rather
more rigorous than that of Freund, being 
carried out at 2-hour intervals. As soon as 
nausea occurred, a strong light was shone 
onto a large piece of cardboard on which
were pasted several photographs of nude or
seminude men. The patient was asked to
select an attractive one, and to recreate the
experiences he had with his current homosexual
partner. This fantasy was verbally
reinforced by the therapist on the first two
or three occasions; thereafter a tape recording
was played twice every 2 hours during
the period of nausea. This consisted of an
explanation of his homosexual behavior, together
with the effect of this on him, such
as “sickening” and “nauseating” being
related to the social consequences. The treatment
was carried out for a period of 30 hours,
and later was repeated for 32 hours.
On the following night the patient was
awakened every 2 hours and was played a
tape recording which optimistically explained 
the future consequences of his no longer being 
homosexual. During the 3 days following 
aversion treatment, photographs of sexually 
attractive young females were placed in his 
room, and each morning he received an in- 
jection of testosterone propionate and was 
told to retire to his room whenever he felt 
any sexual excitement. 

The reader may feel somewhat bewildered 
by the mixture of techniques involved in this 
treatment. However, the outcome was highly 
satisfactory in that there was a complete 
change from homosexual to heterosexual be- 
havior, although the follow-up was only 5 
months. The poor long-term results obtained 
by Freund in his large series should caution 
against placing too much weight on the out- 
come of James’s single case. 

We now have three papers from Thorpe 
and his colleagues. They tried three separate 
techniques for their first patient (Thorpe, 
Schmidt, & Castell, 1963). In the first of 
these, the patient was placed in a small room 
in front of a picture of a female which was 
visible only when illuminated by the psy- 
chologist. The patient was instructed to 
masturbate, using whatever fantasy he wished. 
He was told to report when orgasm was being 
reached. At this point the female picture was 
illuminated until the patient reported he had 
finished ejaculation. After 11 such trials, there 
was no change in the patient’s masturbatory 
fantasy, which remained homosexual. The 
authors concluded that they may have been 
attempting backward conditioning by present- 
ing the picture after reinforcement had com- 
menced. They therefore tried a second tech- 
nique, in which the female picture was il- 
luminated for 1 second at random intervals 
during masturbation. The number of illumina- 
tions was increased until in the later trials 
the picture was more under illumination than 
in darkness. This method was also unsuccess- 
ful, and was therefore dropped. A third 
technique was then introduced, and was 
carried out in a room with a floor area of 
9 square feet, which was completely covered 
by an electrical grid. This technique consisted 
of a combination of positive and negative 
conditioning trials conducted separately. The
former were those described in connection with
the second technique and 38 trials were given. 
The negative trials were carried out by il- 
luminating one of the patient’s own photo- 
graphs of a nude male (we are not told for 
how long) and at the same time delivering 
a strong electric shock through the grid to 
his bare feet. The shock was turned on ¼ to 1
second after the picture had been illuminated. 
A variable interval/variable ratio schedule of 
reinforcement (VI/VR) was used in conjunc- 
tion with a classical conditioning technique. 
Within each trial, the picture was illuminated 
40 times. On nine of these occasions, ran- 
domly selected, the patient was shocked. 
Usually, five trials were given per session, 
although on occasion this was increased to 
10 trials. Each trial took 10 minutes, and 
100 trials were given in all. It can be estimated 
that the patient was receiving between 200 
and 400 illuminations of the male slide in each 
session. This large number of stimulus pres- 
entations introduces, in the event of a suc- 
cessful outcome of treatment, the possibility 
of stimulus satiation providing the explanatory 
factor in addition to the hypothesized clas- 
sical conditioning. Follow-up contact appears 
to have been by letter. The patient reported 
utilizing heterosexual fantasy, and stated that 
he had had one attempt at heterosexual inter- 
course. Occasional homosexual patterns of 
behavior had occurred, but the patient was 
not unduly worried about these, which he 
regarded as a safety valve. Whereas, before 
treatment, he had only considered young men 
and boys, he now considered persons of both 
sexes. The authors admit that many would 
consider this patient to have technically re- 
lapsed. However, they predict a satisfactory 
heterosexual adjustment for him, and they 
therefore consider his treatment to have been 
successful. 

Thorpe and Schmidt (1963) next report 
a case which they describe as a therapeutic 
failure. The patient was required to stand in 
the same small room as used for the previous 
patient. A male picture was illuminated for 
1 second, 84 times in 15 minutes at random 
intervals, and in 30 of these, randomly inter- 
spersed, the illumination was accompanied by 
a shock to the feet which began ¼ second
after the picture had appeared; that is, they
again used classical conditioning and a VI/VR
schedule. Kimble (1961) reports evidence
that partial reinforcement hinders response
acquisition by classical conditioning. In view
of this fact, Thorpe’s use of partial reinforcement
is rather surprising. A further criticism
concerns the very short stimulus exposure 
time. This hardly seems sufficient for the 
patient both to light adapt and to perceive 
the content of the stimulus. The patient 
received three sessions of treatment over a
period of 2 days, after which he refused
further treatment. It is by no means clear
why a patient who received only a very few 
sessions should be described as a therapeutic 
failure, while one who received a much larger 
number of sessions of somewhat similar treat- 
ment should be described as a success despite 
“a technical relapse.” 

The third paper contributed by Thorpe and 
his colleagues (Thorpe, Schmidt, Brown, & 
Castell, 1964) describes a technique termed 
“Aversion relief therapy: A new method for 
general application.” The range of usefulness 
for which they hope is illustrated by its application
to three homosexuals, one transvestite,
one motorcycle fetishist, one phobic,
one obsessional, and one compulsive overeater.
The authors argue that verbal representations
of behavior can be substituted for
actual behavior with no loss of effectiveness,
citing as support Wolpe’s symbolic method of
systematic desensitization (Wolpe, 1958).
They also argue for the necessity of terminating
aversive conditioning by a relief stimulus
which should in some way be associated with
the behavior desired by the patient as a substitute
for his socially unacceptable behavior.
They prepared for each patient a disc which
had up to 24 appropriate words typed on it,
for example “homosexual” and its synonyms,
the last word being a different one in each
case. For instance, for the homosexual patient
a “relief” word such as “heterosexual” was
used, and was never associated with shock.
The patient was told to read the word aloud
as it appeared in an illuminated aperture for
about 25 seconds, about 10 seconds being
allowed between each word. As he did so, he
received a shock. If he failed to do so, he
received a more intense shock. In each session
there were five such trials, each having a
different number and order of words, but always
terminated by a relief word. Each trial
was separated by 5 minutes, and about one 
session per day was administered. The ra- 
tionale behind shocking the patient as he 
reads a word clearly involves classical condi- 
tioning. That behind shocking him when he 
fails to do so is somewhat less clear. It could 
be argued in criticism that he is shocked for 
avoiding the previously attractive stimulus, 
the avoidance response having been set up by 
the classical conditioning procedure. An anal- 
ogy would be the work of Solomon and Wynne 
(1953), who used an instrumental technique 
to set up an anticipatory avoidance response 
in dogs. They found that in order to extinguish
this response, they had to resort to
special procedures, described in Solomon, 
Kamin and Wynne (1953). One of these in- 
volved shocking the dog for carrying out the 
avoidance response. The method of Thorpe 
and his colleagues appears to involve the risk 
of doing this. At the very least, the situation 
appears less than clear-cut. Of the three homo- 
sexual patients described in this paper, all of 
whom responded well and who received be- 
tween 14 and 30 sessions of treatment, only 
one appears to merit a Kinsey rating of more 
than 2. The majority of patients appearing
for treatment for homosexuality at psychiatric
clinics have a Kinsey rating of at least 4, so
that the value of the technique requires much
more supporting evidence. In addition, the
maximum follow-up for any of the three was 4
weeks.

McGuire and Vallance (1964) describe 
what they state to be a classical conditioning 
technique. The patient is required to signal to 
the therapist when the image of his usual 
fantasy is clear. When he does so, a shock is
administered. The procedure is repeated
throughout a 20 to 30 minute session which is
held up to six times per day. McGuire has
designed a small and completely portable
electrical apparatus used in the treatment,
and this is usually handed over to the patient
so that he can treat himself in his own home.
He is told to use the apparatus whenever he
is tempted to indulge in the fantasy concerned.
One doubt concerning this technique
lies in the interpretation of the term “clear.”
If this means that the patient has achieved a
complete representation of his usual fantasy,
the point is raised that at best the authors 
may be carrying out a variety of punishment 
learning (Estes, 1944). In this paradigm, the 
noxious stimulus occurs following the comple- 
tion of the undesirable response. Estes showed 
extinction of a newly acquired avoidance re- 
sponse to be particularly rapid with this 
technique. 

It is, of course, left to the patient himself 
to set the level of the shock, and this raises 
an additional and even more serious criticism. 
Sandler (1964) has provided a most useful 
discussion of the concept of masochism, de- 
fined as the situation in which a noxious stim- 
ulus does not result in the subject receiving it 
displaying avoidance behavior. Conversely, 
the noxious stimulus appears to be not only 
tolerated but even sought after. He provides a 
large number of experimental analogues to 
various clinical forms of masochistic behavior, 
so that in the most dramatic examples organ- 
isms have been found actually working for 
punishing results. In one variety of maso- 
chistic behavior, relevant in the present con- 
text, aversive stimuli might be paired with 
the reinforcer which follows a given activity. 
The end result may be that “the aversive 
stimulus becomes positively reinforcing in the 
same process [Skinner, 1953, p. 367].” This 
view is in accordance with analytic thinking. 
For instance, Fenichel (1945) states: “Cer- 
tain experiences may have so firmly estab- 
lished the conviction that sexual pleasure 
must be associated with pain, that suffering 
has become the prerequisite for sexual pleas- 
ure [p. 357].” It might well be that self- 
treating patients will use a fairly low level of 
shock. This level then becomes associated 
with a very well-reinforced event, so that it 
serves as a positive reinforcer. Thus, far from 
the electrical shock being aversive, it might
become part of the normal fantasy situation. 

McGuire and Vallance (1964) present 
treatment results for six homosexual patients 
of whom three discontinued treatment and 
three showed an improvement which could 
have been either mild, good, or symptom re- 
moved; it cannot be concluded which from 
their description. They state that the follow- 
up time in most cases was 1 month; a much 
longer one is clearly required. In a later paper
(McGuire, Carlisle, & Young, 1965) these
workers claim “at least as good results as the 
more elaborate mimes and cues provided by 
other aversion therapists [p. 187].” Unfor- 
tunately they provide no data to support their 
claim. 

Solomon and Wynne (1953) and Turner 
and Solomon (1962), using both dog and 
human subjects and a neutral CS, demon- 
strated that the technique of anticipatory 
avoidance learning set up avoidance responses 
which were very highly resistant to extinction. 
Solomon (cited by Eysenck, 1964), using dogs 
as his subjects, and Aronfreed and Reber 
(1965), using children, have also shown the 
effectiveness of the technique in setting up 
highly stable avoidance responses to an at- 
tractive CS—a point particularly relevant in 
the present context. Feldman and MacCulloch 
(1964, 1965) have adapted the anticipatory 
avoidance technique to the clinical situation 
in the following way: The homosexual patient 
views a male slide which is back-projected on- 
to a screen. He is instructed to leave the 
picture on for as long as he finds it attractive. 
After the slide has been on the screen for 8 
seconds, the patient receives a shock if he has 
not by then removed it by means of a switch 
with which he is provided. If he does switch 
off within the 8-second period, he avoids the 
shock. Once the patient is avoiding regularly, 
he is placed on a standardized reinforcement 
schedule which consists of three types of 
trials, randomly interspersed. The first type 
consists of reinforced trials (the patient’s 
attempt to switch off succeeds immediately). 
The second consists of delay trials (the pa- 
tient’s attempt to switch off is held up for 
varying intervals of time within the 8-second 
period. He does, however, eventually succeed 
in avoiding.) Finally, one-third of all trials are 
nonreinforced (the patient does receive a 
shock irrespective of his attempts to switch 
off). In addition, on two-fifths of the trials, 
selected at random, a female slide is projected 
onto the screen contiguous with the offset of 
the male slide, and is left on for about 10 
seconds. This is then removed by the therapist 
and the patient can, if he wishes, request that 
it be returned. However, his request is met in 
an entirely random manner. A further feature 
is the use of hierarchies of male and female
slides. The patient places the slides in their
order of attractiveness for him so that treat- 
ment starts with the least attractive male slide 
being paired with the most attractive female 
slide; the two hierarchies then being moved 
along simultaneously. The interstimulus in- 
terval varies between 15 and 35 seconds ran- 
domly, and about 25 stimulus presentations 
(trials) are given per session, which lasts for 
about 20 minutes. This is a VI/VR reinforce- 
ment schedule, but in the context of an in- 
strumental conditioning technique. 

In a paper prepared for publication in Sep- 
tember 1964, Feldman and MacCulloch 
(1965) reported that of their 16 patients who 
had at that time completed treatment, 10 had 
shown a complete absence of homosexual 
practice, together with a complete or almost 
complete absence of homosexual fantasy. In 
addition, these 10 were either actively practicing
heterosexually or had strong heterosexual
fantasies. The authors propose a
change of this nature as a reasonable criterion
of improvement. At that time, the
follow-up period varied between 1 and 1
months. Twenty-six patients have completed
treatment at the present time (July, 1965)
and have been followed up for at least 3
months; the maximum follow-up is now í
years. Eighteen patients have shown the kind
of improvement which is defined above. Eight
patients are wholly or largely unchanged. Ten
further patients are either under treatment at
the present time or have recently completed
treatment but have been followed up for less
than 3 months. Of this total sample, two-thirds
had a pretreatment Kinsey rating of 5
or 6. All of the remaining patients had a
Kinsey rating of at least 3. It is planned to
extend the present series until it numbers approximately
40, and then to set up and carry
out a fully controlled prospective trial.

One major criticism which can be made of
this technique is that the reinforcement
schedule is somewhat rigid (one-third of all
trials are nonreinforced). It would probably
be more satisfactory to gradually reduce the
proportion of shock trials so as to increasingly
approximate real-life conditions.

Other Deviations. Raymond (1956) and
Oswald (1962), treating respectively a perambulator
fetishist and a rubber mackintosh
fetishist, used substantially the same technique.
The description provided by Raymond
will serve for both papers. His patient was 
shown a collection of handbags, perambulators, 
and colored illustrations immediately after 
receiving an injection of apomorphine and 
just before nausea was produced. The treat- 
ment was given every 2 hours, day and night. 
No food was allowed, and at night ampheta- 
mine was used to keep him awake. After a 
week of this regime, he spent 8 days at home. 
He then had several further days of the same 
type of treatment. He remained well for 3 
years, at which time he began to find control 
more difficult, and received a further course of 
treatment. At the time of the last follow-up, 
2 years later, he was still doing well (follow- 
up data in Coates, 1964). Oswald (1965) has 
reported a similarly long-term (54 months) 
follow-up of his fetishist patient. Cooper 
(1963) used emetine as the aversive stimu- 
lus and, as in the cases of Raymond and Os- 
wald, employed classical conditioning with a 
100%-reinforcement schedule in the treat- 
ment of a female clothes fetishist. In this 
case, the patient was actually required to 
carry out his fetishistic acts. With the onset
of nausea and vomiting, the patient was returned
to bed and received intensive moral
suggestion. During the whole of the day he
was not allowed to discard his female clothes,
but was instructed to look at his reflection in
the mirror and to reenact in his mind every
detail of his “disgusting perversion.” The patient
was kept awake at night by means of
amphetamine, and a tape recording was
played every 2 hours for 20 minutes. The
patient finally broke down after 7 days of this
regime, having neither eaten nor slept for 6
days. Three days after treatment, a right
ventricular stress was noted and this was considered
to be due to a toxic myocarditis produced
by emetine. Nine months after treatment
he was still not practicing his fetishism,
and was having normal intercourse with his
wife. A very similar technique was used by
Clark (1963), again with a female clothes
fetishist. The emphasis on disgust placed by
some therapists is shown by the following
phrase from Clark: “At one session, by a
particularly happy chance, one of his favourite
pictures fell into the vomit in the basin so
that the patient had to see it every time he
puked [p. 405].” The follow-up was over a 
3-month period, at the end of which the pa- 
tient was still doing well. 

Thus far, we have only single case studies, 
from which conclusions are notoriously dif- 
ficult to draw. Barker (1965) reports the 
treatment of two patients. His first patient 
was treated with the use of apomorphine as 
the aversive stimulus, and slides of the pa- 
tient in his female clothing were used as condi- 
tional stimuli. Reinforcement was 100%, and 
68 treatment trials were given every 2 hours 
for 6 days and nights. The patient went 
abroad so that follow-up was difficult, but 
Barker states that as far as he knew the pa- 
tient had remained symptom free for 18 
months. Despite the apparent therapeutic ef- 
fectiveness of chemical aversion for this pa- 
tient, Barker found its disadvantages to be 
so substantial as to outweigh its advantages, 
and some of the criticisms he made will be 
listed in a later section. Barker’s second pa- 
tient was treated with the use of electrical 
aversive stimulation (as described by Thorpe, 
Schmidt, and Castell, 1963) and his wearing 
female clothes as the conditional stimulus. 
Treatment sessions, each consisting of five 
trials, were administered every ½ hour, with 1
minute between each trial. Four hundred trials 
were given in all, over a 6-day period. The 
patient began to dress at the beginning of 
each trial and continued to do so until sig- 
nalled to undress, either by shock from the 
grid or by a buzzer, randomly interspersed 
over the 400 trials. The signal recurred at in- 
tervals of 5, 10, or 15 seconds, randomly 
interspersed, until he was undressed, and the 
interval between commencement of dressing 
and signal onset was randomized at between 1 
and 3 minutes. For the last 75 trials, no difference
in rates of undressing between shock
and nonshock trials was observed. No data,
however, are given for the first 325 trials. 
Except for one isolated relapse, the patient is 
described as being symptom free for 14 
months after treatment. Barker considered 
that a classical conditioning paradigm could 
not account for his successful result, and that 
the instrumental act of undressing associated 
with shock escape carried out on a 50% reinforcement
schedule suggested that instrumental
conditioning was playing an important
part in symptom relief.

This argument is strengthened by a series 
of 19 transvestist patients reported by Mor- 
genstern, Pearce, and Rees (1965). Of these 
patients, six refused treatment, six relapsed 
after completing treatment but practised much 
less frequently than prior to treatment, and 
seven ceased to cross-dress altogether. The 
13 patients who completed treatment received 
39 sessions, given three times per day. Apo- 
morphine was used as the aversive stimulus, 
and injections were graded so that the peak 
effect of nausea and vomiting coincided with 
the completion of the cross-dressing ritual. 
The authors report on the relationship of a 
battery of tests to the treatment outcome, the 
most significant relationship being that be- 
tween outcome and a verbal conditioning (in- 
strumental) procedure, whereas eye-blink con- 
ditioning (classical) showed no relationship 
with outcome. Only the cured patients showed 
evidence of verbal conditioning. Morgenstern, 
Pearce, and Rees conclude that an instru- 
mental form of conditioning is involved in 
aversion therapy. 

MacCulloch, Feldman, and MacCulloch, 
using the same instrumental anticipatory 
avoidance technique as described above for 
the treatment of homosexuality, treated a 
group of six patients displaying various types 
of sexual deviation other than homosexuality. 
In the case of two transvestists, the sexual 
stimulus (CS) used was a photograph of the 
patient in his female clothing, and in the 
case of the other patients, the CS was the 
preferred sexual object. Relief stimuli were 
used for four of these patients; in three cases 
these were photographs of the patients’ wives. 
Four of the six patients ceased their sexually 
deviant behavior; the present follow-up vary- 
ing from 2 to 18 months. 

McGuire and Vallance (1964) reported 
eight varied cases of sexual deviation treated 
by associating fantasy with self-administered 
electric shock. All eight were stated to have 
improved. In no case did the follow-up ex- 
ceed 1 month. 


1 A detailed report of treatment methods and
patients’ case histories is in preparation and will be
given when a somewhat longer follow-up time has
elapsed.

Finally, Thorpe, Schmidt, Brown, and Castell
(1964), using the technique applied by
them to homosexuality and described earlier, 
treated one transvestist (with success) and one 
motorcycle fetishist. The former had been 
followed up for 2 weeks and the latter was 
still in treatment at the time of the report. 


Discussion 
Derivation from Experimental Findings 


Very few of the techniques described above 
have been derived in any logical way from 
the general body of the experimental psy- 
chology of learning. Most of the papers 
quoted contain little or no discussion of the 
kind of predictions for treatment which learning
theory would be expected to make. Eysenck
(1965) has severely criticized this deficiency
in the following terms: “For all the
attention that is being paid to them by practitioners
in the field, the theoretical and experimental
advances in learning and conditioning
methodology might just as well not
have taken place [p. 12].” It would be unfortunate
if behavior therapy were to part
from the general body of experimental psychology
and become an entirely separate field.


The Choice of Learning Paradigm 


In contrast to the follow-up reports of
phobic patients (Cooper, Gelder, & Marks,
1965), which indicated a substantial rate of
posttreatment spontaneous remission, sexual
deviations appear rather to tend to relapse
following treatment (cf. Freund, 1960). It is
therefore mandatory to adopt the learning
technique which is likely to be most resistant
to extinction, as has been pointed out by
Eysenck (1963). It is rather surprising that
so many of the authors cited in Tables 1A
and 1B should have chosen to use classical
conditioning, the effects of which are rather
poorly resistant to extinction (Solomon &
Brush, 1956). These authors report that instrumental
avoidance learning techniques,
such as anticipatory avoidance, appear to be
by far the most highly resistant to extinction.
It was pointed out earlier in the present
paper that sexual behavior appears to have a
very strong operant component, and the comments
and results of Morgenstern, Pearce, and
Rees (1965) and Barker (1965) support the
case for an instrumental rather than a classical 
technique. Turner and Solomon (1962) have 
shown that within the context of an instru- 
mental technique the subject should perform 
an operant response, rather than a reflexive 
one, for the greatest resistance to extinction. 

It is desirable to incorporate into the train- 
ing situation those variables which have been 
shown to further increase resistance to extinction.
As listed by Feldman and MacCulloch
(1965), some of these are as follows: 


1. Learning trials should be distributed 
rather than massed. 

2. Contiguity of stimulus and response, par- 
ticularly at offset, should be maintained 
throughout. 

3. Shock should be introduced at whatever 
level has been found to be unpleasant for the 
patient rather than gradually increased, thus 
possibly enabling the patient to habituate. 

4. Partial reinforcement should be used in
conjunction with instrumental techniques.

5. Reinforcement should be variable rather
than fixed, both in ratio and in interval
schedules.

6. There is a good deal of data which suggest
that delaying a proportion of the patient’s
attempts to avoid should lead to
greater resistance to extinction than immedi- 
ate reinforcement. 

7. In general, the greater the variation in 
the conditions of training the more will these 
approximate to the real life situation, thus 
avoiding, as far as possible, generalization 
decrement, probably the most potent source 
of rapidity of extinction. 


Choice of the Aversive Stimulus: Chemical 
or Electrical 


Both Rachman (1965) and Barker (1965) 
point out that chemical aversion is highly unpleasant,
not only for the patient but also
for the therapist and the nursing staff; there
is also some evidence that it brings about increased
aggressiveness on the part of the patient.
Several other advantages of electrical
aversion, as listed by Rachman, are as follows:
precision of control, manipulability of
variables, the possibility of using partial reinforcement,
and the possibility of more accurate
measurements of the progress of treatment.
In connection with this last point, MacCulloch,
Feldman, and Pinschof (1965) report
data on the measurement of avoidance response 
latencies and pulse-rate changes during the 
treatment of homosexual patients by electrical 
aversion. It will be recalled that Barker 
(1965), in comparing the use of chemical and 
electrical aversive techniques, used only two 
patients. As Raymond (1965) points out, 
this sample is somewhat small. Raymond 
argues, without however presenting any more 
evidence than the subjective report of one 
patient, for the use of chemical rather than 
electrical aversion. It certainly seems rea- 
sonable to point out, in view of its much 
more harmful side effects and the much 
greater pressure on time of staff, that unless 
chemical aversion produces a better outcome 
than electrical aversion, there is no case for 
its use. 


The Choice of the Conditional Stimulus 


The patient may not always be able to 
reproduce his fantasy, and merely handing 
him pictures to look at may not be sufficient 
to hold his attention in face of the distractions 
of the treatment room. A mechanically pro- 
jected picture has the advantages of clarity 
of reproduction and ease of control. With 
respect to the treatment of transvestists and 
fetishists, the choice appears to be between 
the patient wearing or handling his fetishistic 
stimuli, and of having pictures of them 
presented by some mechanical means. Once 
again, the argument concerning ease of con- 
trol supports the use of slides of the fetishistic 
stimulus. However, in those cases in which 
tactile stimulation forms the central element 
of the fetish, the use of slides is clearly in- 
adequate. Wherever possible, the patient 
should provide his own photographs of satis- 
fying real-life stimuli to reduce the problem 
of generalization decrement. 


The Introduction of an Alternative Response 


A major problem is to substitute for the 
deviant sexual behavior a form of sexual 
outlet that is both desired by the patient and 
is socially possible. This will involve hetero- 
sexual fantasy and, ideally, overt heterosexual 
behavior. Several techniques, particularly 
those involving chemical aversion, make little
or no attempt to carry out this substitution. 
The more refined experimental control en- 
abled by the use of electrical aversion makes 
it readily possible. For most homosexual pa- 
tients, female sexual stimuli are either neutral 
or unpleasant, and therefore evoke avoidance 
rather than approach responses. The fact that 
patients feel relief (hence the term “relief” 
stimulus) when the male stimulus is removed, 
suggests a method of changing these re- 
sponses to female stimuli. In the techniques 
of Feldman and MacCulloch (1965) and 
Thorpe, Schmidt, Brown, and Castell (1964) 
this is achieved by introducing the female 
stimulus contiguous with the removal of the 
male stimulus. Kimble (1961) has summed 
up the experimental data on the use of relief 
stimuli as follows: “Stimuli associated with 
the cessation of shock are secondary rein- 
forcers. Looking at it another way, they take 
on a value seemingly opposite of that ac- 
quired by stimuli which accompany shock on- 
set [p. 176].” That is, the fact that the fe- 
male slide is associated with the cessation of 
pain increases the likelihood that it will ac- 
quire positive reinforcing properties. It has 
been found helpful (Feldman & MacCulloch, 
1965) to use photographs of wives or girl- 
friends as part of the hierarchy of female 
pictures wherever possible. This, of course, 
helps to reduce generalization decrement. The 
use made of a relief stimulus by Thorpe, 
Schmidt, Brown, and Castell (1964) can be 
criticized on the grounds of its predictability 
—it always appeared on the last stimulus 
presentation, indicating the end of the session, 
rather than the avoidance or cessation of pain. 
Moreover, the fact that a verbal stimulus is 
used makes the situation not particularly 
realistic. 

McGuire, Carlisle, and Young (1965)
instructed their patients that whatever the
initial stimulus to masturbation, the fantasy in
the 5 seconds just prior to orgasm must be of 
normal sexual intercourse. Thorpe, Schmidt, 
and Castell (1963) criticize their own some- 
what similar technique on the grounds that 
its failure probably involved backward con- 
ditioning. A further objection can be made— 
several of the patients treated at Crumpsall 
Hospital have of their own accord attempted 
to fantasize females late in the masturbatory 
sequence. A frequent consequence has been
detumescence, thus adding a further incre- 
ment of strength to the habit of not approach- 
ing females. 

It is suggested that the most effective com- 
bination is to introduce female photographs 
as relief stimuli in order to initiate approach 
responses to females, and then to gradually 
shape these in the manner described by 
Ferster (1965). He argues that it is desirable 
to start with a response which is relatively 
likely to be reinforced, such as simply speak- 
ing to females, and then to proceed up the 
hierarchy of responses which are increas- 
ingly less likely to be reinforced, not pro- 
ceeding to the next one in the hierarchy 
until the preceding one has been very well 
established. The experience of the author and 
his colleagues is that it may take as long as
6 months before sexual approach responses to 
females are really well established, particu- 
larly in those patients who have either never 
had any sexual attraction to females or have 
not had such attraction for very long periods 
of time, say, 10 years or more. 


The Use of Stimulus Hierarchies 


For the purposes of the present discussion 
we shall confine our attention to homosexual 
patients, but the argument which follows ap- 
plies equally to those displaying other sexual 
deviations. The great majority of techniques 
described in the present paper introduce male 
or female stimuli quite unsystematically, 
without regard to their relative degree of 
attractiveness or repulsion to the patient. At 
the onset of treatment the dominant sexual 
response is a homosexual one, the hetero- 
sexual response being nondominant. The prob- 
lem is to design the treatment so as to re- 
verse this situation. It is suggested that the 
order of presentation of conditional stimuli 
is a variable of considerable importance. In 
Wolpe’s (1958) desensitization technique, 
principally applied to problems of approach 
learning such as phobias, the patient begins 
his treatment by exposure to a situation which 
is only slightly anxiety provoking, moving on 
to a more difficult situation when the previous 
step in the hierarchy is no longer evoking 
anxiety. Applying this principle to avoidance 
learning in the treatment of homosexuality, it 
seems logical to begin with a male stimulus
which is only mildly attractive and to which
an avoidance response may be set up with
relative ease. It follows that if a female
stimulus is introduced contiguous with the
offset of the male stimulus, it should be as
attractive to the patient as possible. This
further increases the ease of setting up an 
avoidance response to the male stimulus 
(Kimble, 1961) as well as increasing the 
strength of approach to the female stimulus. 
Once the avoidance response to the first male 
stimulus and the approach response to the 
first female stimulus are well established, the 
patient can be taken along hierarchies of 
ascending attractiveness of male stimuli and 
descending attractiveness of female stimuli. 
The principles underlying the use of stimulus
hierarchies are so well established in
experimental psychology that their neglect by
aversion therapists is rather surprising. 


Clinical Factors 


Some of the reports described above are 
marked by a rather moralistic overtone. This 
is particularly clear in the papers by Cooper
(1963) and Clark (1963), in both of which
the authors used tape recordings which
stressed the “disgusting and unpleasant”
nature of the patient’s sexual deviation. Apart
from the fact that this, particularly when
used together with chemical aversion, might
render the whole situation so unpleasant as to
force the patient into a “flight into health,”
a further factor should also be mentioned.
This is the unsuitability of the therapist
expressing strong opinions concerning the
patient’s practices, particularly in the absence
of evidence that such a degree of hectoring
condemnation is essential to the effective
outcome of treatment.

Westwood (1960) has pointed out that
therapists tend to see an atypical sample of
homosexuals. It is possible that this
atypicality resides in the marked incidence of
psychiatric disturbance displayed by
homosexuals presenting themselves at psychiatric
clinics. In many of the reports which are
reviewed above, there is an inadequate
psychiatric description of the patients concerned.
No fewer than 18 of the 26 homosexual
patients in Feldman and MacCulloch’s series
(see Table 14) who have completed aversion
therapy treatment to date have displayed
psychopathology, ranging from a depressive
psychosis to acute or chronic personality dis- 
orders. It is extremely important, therefore, 
that there be full psychiatric participation in 
aversion therapy treatment to enable the di- 
agnosis of coexisting psychopathology, and 
the treatment of this where necessary and 
possible. None of the patients in the Feldman 
and MacCulloch series have received treat- 
ment in addition to avoidance learning other 
than adjuvant drug therapy and supportive 
psychotherapy of a superficial kind. No pa- 
tients have received either drug therapy or 
psychotherapy as the sole or even major por- 
tion of their treatment. 

A full description of the mental state at 
the onset of treatment makes it possible to 
set up predictive relationships between per- 
sonality factors and the outcome of treatment. 
MacCulloch and Feldman² have found that
those patients displaying self-insecure per- 
sonality traits (Schneider, 1959) not only 
have a better motivation for treatment but 
show a better response to treatment than do 
those who show attention-seeking personality 
traits (Schneider, 1959). 

All too often, the details of follow-up and 
outcome are extremely scanty. For instance, 
McGuire and Vallance (1964) state that most 
of their patients were followed up for 1 
month. Thorpe’s various patients (Thorpe 
& Schmidt, 1964; Thorpe, Schmidt, Brown, 
& Castell, 1964; Thorpe, Schmidt, & Castell, 
1963) appear to have been followed up for 
about the same length of time. Not only does 
it appear inappropriate and perhaps
misleading to publish successful single cases
(therapists are, on the average, less likely to
report unsuccessful single cases), it is also
desirable that follow-up should be of at least
several months’ duration. Such longer
follow-ups may well provide a very necessary
correction of early optimism, particularly as
most aversion therapists have used classical 
conditioning which shows a rather poor re- 
sistance to extinction. An apparently suc- 
cessful outcome may therefore resist relapse 
for the few weeks of a short-term follow-up, 
but may not survive a longer and correspond- 
ingly more searching assessment. 


2 A report of this work is currently under
editorial consideration.


Pre- and Posttreatment Assessment


There is a very strong necessity for an 
objective evaluation of the direction of sexual 
interests before and after treatment which is 
independent of clinical data, and which will 
employ indices different from those involved 
in the treatment situation itself. Brown (1964) 
has described a technique in which the num- 
ber of times a subject operates a shutter to 
reveal a picture is used to index the relative 
intensities of sexual interest. This technique 
has been criticized by Koenig (1965). A re- 
cent, and very promising, objective approach 
which involves eye-pupil responses to sexual 
stimuli is that of Hess, Seltzer, and Shlien 
(1965). Feldman, MacCulloch, Mellor, and 
Pinschof³ are developing a sexual
approach-avoidance scale which combines features of the
semantic differential technique (Osgood, Suci,
& Tannenbaum, 1957) and the personal ques- 
tionnaire technique of Shapiro (1961), and is 
used to assess the relative levels of homo- and 
heteroerotic interest prior to and following 
treatment. 


CONCLUSION 


It cannot yet be said that there is an over- 
whelming case for the efficacy of any single 
aversion therapy technique in the treatment 
of any single sexual deviation, although the 
results obtained to date suggest that instru- 
mental techniques are both theoretically more 
likely to be successful than those based on 
classical conditioning, and have also achieved 
a reasonable measure of practical success. A 
major purpose of the present paper is to argue 
the need to derive any aversion therapy treat- 
ment logically from the general body of learn- 
ing theory, rather than to construct ad hoc 
and undigested mixtures of often inappropri- 
ate and mutually contradictory variables. If, 
having carefully derived the treatment
technique, it then fails, the reason for failure will
lie in the unsuitability of the technique to the
problem concerned, particularly if a large
series of patients has been used so as to
overcome the bias inherent in the single-case
method. Should the treatment technique
succeed, it can then be developed so as to
maximize its effectiveness. Finally, it is argued
most strongly that the future value of
aversion therapy techniques depends upon
therapists maintaining their links with, on the one
hand, general experimental psychology, and
on the other, with general clinical psychiatry.


3 Principal components analyses show the scale to
be unifactorial both for homosexual patients and
for heterosexual controls. The scale discriminates
without overlap between controls and pretreatment
patients, while improved patients, but not
unimproved ones, show a considerable overlap with
controls. One month test-retest reliability in controls is
0.8, and preliminary results indicate little change in
homosexual patients over a pretreatment interval
averaging 2 months. A report on this work is in
preparation.


PERCEPTION-COGNITION

Nine experiments designed to investigate the effect of food deprivation on
perceptual-cognitive processes are examined in detail. An effect is revealed in 
only some of these experiments. The deviating results are explained by as- 
suming that motivational state will not affect perceptual-cognitive processes 
unless the material presented is meaningful in relation to the motivational 
state. An examination of the operational definitions given of the processes 
studied indicates that the processes may be more meaningfully termed imaginary
than perceptual. An examination of the operational definition of the motiva- 
tional state of hunger revealed that in most of the experiments the important 
condition may not be hours of food deprivation, but the expectancy of the Ss 
as to when they may next receive food. 


American psychology comprises a variety of 
research traditions. Perceptual research of the 
psychophysical as well as the phenomenologi- 
cal approach has been carried on throughout 
this century. Still, it seems fair to state that 
the study of perception did not become cen- 
tral in American psychology until after the 
Second World War. The perceptual research 
which then emerged differed from that of the 
previous traditions in two important respects. 
In the first place, it represented a change in 
interest, the functional aspect being empha- 
sized. The dependence of perceptual processes 
on motivational states and personality factors 
was brought to the fore. Secondly, a methodo- 
logical reorientation took place. Perceptual 
experiences tended to be treated as inferred 
events, the emphasis in the determination of 
the term perception being placed on the set 
of antecedent conditions and the set of re- 
sponses. Reference to perceptual experience 
was avoided as a result of the behavioristic 
conceptions of psychology. The meaning of 
the term perception thus underwent an
important change. This change in meaning was
not realized when the problem of the
relationship of motivational state and personality
factors to perception was discussed (cf.
Allport, 1955; Graham, 1951; Murphy, 1947; and
the two symposia on personality and
perception edited by Blake & Ramsey, 1951;
Bruner & Krech, 1950). Gradually the point
of view seems to have developed that the
extent to which perceptual processes may be
said to depend on motivational state and
personality factors is, to a large extent, a
question of definition (cf. Garner, Hake, &
Eriksen, 1956; Goldiamond, 1958; Hochberg,
1956; Prentice, 1956). Accordingly, the
interest in the problem seems to have declined.


1 The writer is indebted to his wife, L. Fegersten
Saugstad, for her suggestions in interpreting the
results of the experiments. He is also indebted to Leif
J. Braaten for correcting the English and improving
the readability of the manuscript.

The writer is in agreement with the
psychologists who have criticized the tendency to
broaden the term perception (Saugstad, 1965).
Still, he believes that the experimental litera- 
ture based on the broad conception of what 
constitutes perceptual processes reveals some 
trends of great importance for theoretical as 
well as applied psychology. He will base this 
review on the broad term
perception-cognition. At the end of the discussion, the
operational definitions of perception used in the
studies will be commented upon. 

The thesis of this review is that the more
meaningful a material is in relation to some 
motivational condition, the greater the proba- 
bility will be that the reactions to the ma- 
terial are affected by this condition. This 
thesis seems to run counter to what appears 
to be a widely accepted assumption, namely 
that perceptual material of an ambiguous 
nature, as far as meaning is concerned, is most 
favorable for demonstrating the effect of some 
motivational condition. 

In experimental research dealing with the 
effect of motivational state on
perceptual-cognitive processes, “hunger” is the condition
most conveniently manipulated. In a total of
nine studies (Atkinson & McClelland, 1948;
Brozek, Guetzkow, & Baldwin, 1950; Gilchrist
& Nesberg, 1952; Lazarus, Yousem, &
Arenberg, 1953; Levine, Chein, & Murphy, 1942;
McClelland & Atkinson, 1948; Postman & 
Crutchfield, 1952; Sanford, 1936, 1937) this 
condition has been the one underlying the 
systematic variable. In one study (Gilchrist 
& Nesberg, 1952) changes were first intro- 
duced for the condition of hunger and then 
for the condition of hunger and thirst. Apart 
from the investigation by Brozek et al., the 
operational definition of hunger in all the 
studies is hours of food deprivation. It will 
be pointed out in the Discussion that more 
important than the condition of hours of food 
deprivation may be the expectancy of the Ss 
as to when they will receive their next meal. 

The experimental literature on bodily needs 
and perception-cognition has previously been 
reviewed by Allport (1955), Jenkin (1957),
and Solley and Murphy (1960). In another 
place, Saugstad and Schioldborg (1966) have 
reviewed the literature on the related topic of 
value and perception. 


REVIEW OF STUDIES 
Sanford (1937) 


In a study of an exploratory nature, Sanford
(1936) had used two sets of pictures and two
lists of words. To these two tests were added
a chained association test, a
drawing-completion test, and a word-completion test. The
material for the tests was selected from the
point of view that it might possibly elicit food
responses without containing any actual
reference to food or eating. In responding to the
pictures in the first test, Ss (college students)
were asked to interpret what they were seeing.
If S gave a mere description of a picture, he
was questioned until something imaginative
was added.

The Ss of the experimental group had been
deprived of food for 24 hours. The Ss of the
control group were tested during the ordinary
eating cycle.

Results and Comments. The Ss who fasted
for 24 hours gave, on all five tests taken
together, on the average more food responses
than Ss of the control group. This difference is
statistically significant at the p < .01 level. 
It should also be noted that for all five tests 
the difference is in the same direction. 

The conclusion, therefore, seems to be valid 
that by extensive testing an effect of food 
deprivation on perceptual-cognitive processes 
may be revealed. However, as noted by San- 
ford, when the 24-hour fasters are compared 
to the Ss of the control group tested 3-5 hours 
after meal time, the difference seems to be 
slight. There is thus no steady increase in 
number of food responses with increasing 
hours of food deprivation. 


Levine, Chein, and Murphy (1942) 


These investigators used 40 cards with simple 
pictures drawn in chromatic colors and 40 
cards drawn in achromatic colors as stimulus 
material. All cards were shown behind a 
ground glass screen. According to the in- 
vestigators’ description, there were 15 mean- 
ingless drawings, 15 ambiguous drawings of 
food articles, and 10 drawings of household 
articles in each set of 40 cards. The figures in 
the achromatic cards were reported to lend 
themselves more easily to food responses than 
the chromatic cards because they were more 
easily seen and were more or less round and 
thus resembled fruits. 

The Ss were 10 college students, 5 in an 
experimental and 5 in a control group. The 
Ss of the experimental group were all tested 
at 1, 3, 6, and 9 hours of food deprivation. 
The Ss of the control group were tested after 
lunch time. 

The instruction stated explicitly that S 
should verbalize an association to every pic- 
ture. The Ss of the experimental group were 
all instructed that they would receive a meal 
after the sitting. 

Results and Comments. Pastore (1949) has 
pointed out that the average score of the 
experimental group drops below that of the 
control group if the score of one extreme
individual in the experimental group is
excluded. Obviously results that are so
vulnerable cannot be relied upon. However, it should
be noted that the five Ss in the experimental 
group all reveal the same trend. For the 
chromatic cards, there is a rise from 1-hour 
to 3-hours fasting and then a drop to 6-hours 
and a further drop to 9-hours. For the
achromatic cards, there is a rise from 1-hour to
3-hours and again to 6-hours and then a drop
at 9-hours fasting. There is very likely some 
effect due to the different hours of food de- 
privation in the Ss. Still, the hypothesis of 
the writers is not fully confirmed since, for 
both types of cards, there is a drop with in- 
crease in fasting after a certain interval which 
was not expected. To explain this, the writers 
introduced the following ad hoc hypothesis: 
“as the need increases a reality process starts 
counteracting the autistic processes.” If this 
is true, the conclusion obviously must be that 
only needs of a very mild character, as those 
created by 3-hours fasting, can affect the per- 
ceptual-cognitive responses. The dubiousness 
of this ad hoc hypothesis is seen when one 
contemplates the possible physiological ef- 
fects of 3 hours of food deprivation for a 
well-nourished individual of the Western cul- 
ture. One may doubt whether an effect of the 
need state is detectable in an adult individual 
after 3 hours of fasting. 


McClelland and Atkinson (1948) 


These writers used a situation patterned 
after the studies of Perky (1910) where S was 
presented with a blank screen with the in- 
structions that various objects would be pre- 
sented on the screen. There were 12 different 
instructions referring to pictures which the 
experimenters pretended were projected. The 
instructions differed as to the degree of struc- 
turization provided. Three of the instructions 
contained no hints at all to the contents of the 
pictures (unstructured items). Seven instruc- 
tions provided a hint about the content 
(loosely-structured items). Two called for a 
choice among three stated alternatives (well- 
structured items). 

Some of the Ss were run in an additional 
experiment where they were to compare the 
size and number of food-related and non-food- 
related objects which the experimenter pre- 
tended were presented. Furthermore, to one 
group of Ss, smudges or shadows were
presented instead of blanks.

The Ss were Navy men with a modal age 
of 18 years, deprived of food for 1, 4, and 
16 hours, respectively. 


Results and Comments. With regard to 
number of food responses elicited, the writers 
reported a significant difference at the p < .01
level between the 1-hour and the 16-hour 
group, but no significant difference between 
the 1-hour and the 4-hour group and between 
the 4-hour and 16-hour group. 

Due to the different number of items
referred to as respectively unstructured, loosely
structured, and well structured, a comparison
of number of food responses given to different 
types of items is not easily undertaken. Still, 
it should be noted that the two well-struc- 
tured items elicited about 75-80% of the 
total number of food responses, and that 
these two items are responsible for the major 
portion of the differences found among the 
groups. 

The results of the comparison of the size 
of the food objects to the nonfood objects in- 
dicate that with increasing amount of food 
deprivation there is an increasing tendency to 
judge the former objects as the larger. The 
difference between the 1-hour and the 16-hour 
group was highly significant. In contrast, the 
reported number of food objects as against
reported number of nonfood objects did not
seem to depend on amount of food depriva- 
tion in a consistent manner. Finally, the 
presentation of smudges was not found to 
elicit more food responses than the presenta- 
tion of the blanks. 


Atkinson and McClelland (1948) 


This investigation was performed on the 
Ss of the previous experiment after they had 
completed the tests described above. 

Eight pictures, each exposed for 20 seconds, 
were used as material for the experiment. 
Five of the pictures were taken from the TAT, 
one from the Meier-Seashore Art Judgment
Test, and two were specifically made up for
the experiment.

The Ss were informed that it was a test of 
their creative imagination and were asked to 
write a story about each picture. To guide Ss 
in giving their stories, they were asked the 
same four questions about each picture. These
verbal responses to the pictures were classified
and scored according to a number of different
criteria.

Results and Comments. To one of the
pictures, the verbal reports contained no
statements classifiable as “food imagery.” This
picture was then removed from the test
material. No significant statistical difference
was found between the three groups with
regard to food imagery. However, when the
material was scored according to number of
times food was needed in a story, the plotted
scores showed the same tendency as the two
curves in Levine et al. (1942), and, as we
shall see later, the same tendency as in the
results of Lazarus et al. (1953). The
difference between the scores for the 1-hour and
the 4-hour group is statistically significant at
the p < .05 level. A curve of a similar shape
was also obtained for the scores revealing the
number of times food was the central theme
of the stories, but here none of the differences
are statistically significant. A further analysis
of the food themes revealed that with increase
in amount of food deprivation there was an
increase in the number of times food
deprivation was the central theme of the stories. The
difference between the 4-hour and 16-hour
group as well as the difference between the
1-hour and the 16-hour group are significant at
the p < .05 level. However, there was no
increase from the 1-hour to the 4-hour group.
Likewise, in the curve for the number of
times an instrumental activity aimed at
removing the source of food deprivation appeared, it was
found there was a rise from the 1-hour group
to the 4-hour group and from the 4-hour group
to the 16-hour group. The latter difference is
statistically significant at the p < .05 level.
The number of times food was involved in
what the report named “goal-activity”
(eating or invitations to eat) showed a decrease
from the 1-hour to the 4-hour and from the
4-hour to the 16-hour group. The difference
between the first and the last two groups is
significant at the p < .01 level.

Atkinson and McClelland (1948) worked
out a procedure for weighting the scores
obtained in the different categories. Due to our
inadequate knowledge of the mechanisms
involved, a procedure for weighting may be of
dubious merit. It might be mentioned, though,
that the writers found as a consequence of
this weighting that the only picture where
food was present was superior to the rest of
the pictures in differentiating between the
1-hour and the 16-hour group. This finding is
clearly valid since the reader, under the sec- 
tion on procedure, is informed that this pic- 
ture alone accounts for nearly half of the 
food stories given. 


Brozek, Guetzkow, and Baldwin (1950) 


Whereas the five previously discussed ex- 
periments all dealt with food deprivation over 
a short period of time, this investigation ex- 
amined the effect of prolonged starvation. 
Thirty-six normal young men, conscientious 
objectors, lived for 24 weeks on a diet amounting
to 1570 calories a day, effecting an average
loss of weight of 25%. This period of semi- 
starvation was preceded by a control period 
of 12 weeks and followed by a rehabilitation 
period of 12 weeks. The psychological effects 
of the starvation on the Ss were described as 
follows: 


Psychologically, depression, narrowing of interests, 
social introversion, preoccupation with thoughts of 
food, decrease in spontaneous activity, physical as 
well as mental, and marked diminution in libido were 
among the principal characteristics of the “semi- 
starvation neurosis” [p. 248]. 


The Ss were subjected to the following 
tests: a free word-association test, a restricted 
word-association test, a first-letter test, the 
Rorschach, and Rosenzweig’s P-F Study.
Finally, the dreams of Ss were examined from 
time to time. 

Results and Comments. The most striking 
feature of the results appears to be the small
number of food responses on all the tests. 
There was only one food response on the 
Rosenzweig P-F Study. On the Rorschach, 
there were only 2.4% food responses at the 
end of the starvation period. On the word- 
association test, 5.9% of the responses given 
were related to food or eating. Food dreams 
were very rare. Comparison of the scores for 
the Ss during the control period, on the one 
hand, and during the semistarvation period, 
on the other, revealed no statistically reliable 
difference on any of the tests, nor was there 
a difference with regard to the content of 
the dreams. There were, however, some in- 
teresting findings on the word-association test. 
When food words were directly presented, the 
response words were more often idiosyncratic
for Ss under semistarvation than for a control 
group. The latencies were also prolonged for 
the former group of Ss. These differences are 
statistically significant. 


Postman and Crutchfield (1952) 


These authors used a word-completion test. 
After a standardization procedure, three lists 
of 21 skeleton words each were made up. The 
first list of skeleton words was selected from 
words evoking from 0-10% food words, the 
second from words evoking from 21-35% 
food words, and the third list from words 
evoking from 41-80% food words. The Ss 
were interviewed after the testing and clas- 
sified according to three amounts of previous 
food deprivation: 0-1 hour, 2-3 hours, and 
4-6 hours. Before the word-completion test 
was presented to Ss, they were induced to 
different degrees of set by forcing them to 
complete different numbers of skeleton words 
into food words. Five degrees of set were de- 
termined by having the various groups com- 
plete 0, 1, 2, 3, and 5 skeleton words in 
advance of the completion test. Undergradu- 
ate students from psychology classes par- 
ticipated in the experiment. 

Results and Comments. Amount of food 
deprivation did not significantly affect the 
number of food responses emitted. However, 
food deprivation was found to give significant 
interaction effects with set. Also, interaction 
effects were found between food deprivation 
and stimulus list, the effect being greatest 
for the list evoking a medium number of food 
responses. Further interaction effects were 
found between set and stimulus list. 


Gilchrist and Nesberg (1952) 


Whereas the previous investigations in some 
way or other concentrated on the associations 
produced by Ss to various types of stimulus 
material, these writers employed an entirely 
different technique. They used four pictures 
of food objects as stimulus material. The 
pictures were projected for 15 seconds at a 
definite illumination, illumination being de- 
fined as some definite voltage. After the 
presentation of a picture, the light from the 
projector was turned off by a shutter for 10 
seconds, and the voltage was reset at either 
plus or minus 20 volts of the original setting. 
The Ss, students of elementary psychology, 
were asked to adjust the illumination to the 
original setting by turning the knob of a 
Variac connected to the lamp. 

The Ss of the control group (comprising 
one-half of the Ss) ate their usual meals 
while Ss of the experimental group were de- 
prived of food for 20 hours. All Ss were 
tested at 0 hours, 6 hours, and 20 hours. Es- 
sentially the same procedure was used in a 
second experiment, but now Ss in the experi- 
mental group were deprived of both food and 
water and the pictures were of various types 
of drinks. The Ss of the experimental group 
were tested at 0, 2½, 5, and 7½ hours of 
food and water deprivation. After the last 
session, Ss were given as much water, juice, or 
milk to drink as they wanted and were then 
again tested. In third and fourth experiments, 
Ss were presented with slides producing re- 
spectively homogeneous color fields and pic- 
tures of landscapes instead of need-relevant 
objects. The Ss of the experimental group 
were informed in advance that they would 
have to fast for the number of hours involved 
in each type of experiment. 

Results and Comments. While the scores 
for the control group in the first and second 
experiment stayed at about the same level, 
the scores for the experimental group, both 
in Experiments 1 and 2, rose from session to 
session. The differences among the scores of 
the various sessions were significant at the 
p < .01 level. In Experiment 2, the score 
dropped to the level of the score at 0 hours of 
deprivation after Ss had been given the drink. 
The homogeneous colored fields and the land- 
scapes gave no increase in score with increase 
in time of food and water deprivation. 


Lazarus, Yousem, and Arenberg (1953) 


In these experiments, the task of the Ss, 
male college-student volunteers, was to recog- 
nize or name the object in each of 10 pic- 
tures. Five of the pictures represented food 
objects and five represented nonfood objects. 
The pictures were all presented in a tachisto- 
scope at the same rate of exposure, 2 seconds. 
As in the previously described experiment, the 
illumination at which the pictures were 
presented was varied by means of a Variac 
connected to the lamp of a projector. Be- 
ginning at a voltage of 22, the voltage was 
increased for each trial in steps of 1 volt. 

The criterion for recognition of the objects 
was set at five correct recognitions in succes- 
sion. The number of trials before recognition 
was counted. 

Two experiments were performed. In the 
first, Ss were left free to respond to the pic- 
tures. In the second, Ss were restricted in 
their choice of response by being presented 
with a list containing the names of the 10 
objects in the pictures and 6 additional 
dummy items. The Ss were instructed to se- 
lect their responses from the list. 

In the first experiment, Ss were also pre- 
sented with Sanford’s word-association test. 
As in Sanford’s second experiment, Ss were 
interviewed after testing about the time of 
their last meal and were then grouped ac- 
cording to hours since last eating. 

Results and Comments. No difference was 
found between the different groups on San- 
ford’s word-association test. 

As in the experiment by Levine et al. 
(1942), the greatest effect of food deprivation 
on the responses was found at 3-4 hours. The 
score then rose sharply at 5-6 hours where it 
reached about the same level as the score for 
the 0-hour group. The difference between the 
scores (transformed into logs) of the food- 
deprived and the non-food-deprived Ss was 
statistically significant at the p < .05 level. 
As has already been mentioned, the form of 
the curve is similar to the curve found by 
Levine et al. and Atkinson and McClelland. 

In the second experiment, where Ss were 
provided with lists, the relationship between 
the scores at the different periods of fasting 
revealed no effect of food deprivation on per- 
ception-cognition. 

In the first experiment, where the writers 
counted the number of food responses given 
by the Ss before the correct response was 
given, no correlation was found between num- 
ber of these food guesses and amount of food 
deprivation. As stated by Lazarus et al. 
(1953), this finding is in line with the results 
in the word-association test. 


Discussion 
Type of Material 


When the results of the experiments are 
analyzed, it can be seen that certain types of 
material seem consistently not to give an 
effect. The conclusion seems rather safe that 
the word-association test does not give an 
effect. In the studies by Lazarus et al. (1953) 
and by Brozek et al. (1950), where statistical 
tests were applied, no significant differences 
were found. In the two experiments by San- 
ford (1936, 1937), there is only a slight 
arithmetical difference in favor of the food- 
deprived groups. In line with these negative 
findings are the results of the first-letter test 
of Brozek et al. (1950), and the word-com- 
pletion test of Postman and Crutchfield 
(1952). The lack of an effect in Atkinson and 
McClelland’s (1948) experiment on imagery, 
and the negative results in Brozek et al. 
(1950) on the Rorschach and the Rosen- 
zweig P-F test may also be regarded as being 
in line with these negative findings on the 
word-association test. Apparently, associations 
to some material not directly related to the 
motivational state of hunger will not reveal 
any effect. 

On the other hand, the presentation of 
words having direct reference to food in the 
study of Brozek et al. (1950) revealed an 
effect. It will be remembered that the la- 
tencies for the emission of the associated 
words as well as the number of idiosyncratic 
words in the associations were affected by the 
state of hunger. The results of the experiment 
by McClelland and Atkinson on judged size 
may be said to be in line with this finding. 
In this experiment, the food objects to be 
compared to the nonfood objects were named 
by the experimenter. It will also be remem- 
bered that the majority of the food responses 
in this experiment were given to the two well- 
structured items containing reference to food 
and that these two items might be responsible 
for the major portion of the difference in 
scores between Ss of the food-deprived and 
the non-food-deprived group. A direct refer- 
ence to food was thus provided. Postman and 
Crutchfield (1952), in their word-completion 
test, obtained an effect between state of 
hunger and set, but no effect by hunger 
alone. The correct interpretation may be that 
the procedure for establishing the set struc- 
tured the material so that it became relevant 
to the need. In the reactions to the pictures, 
the same tendency is apparent as in the verbal 
material. The effect was unequivocally pro- 
duced in the investigation of Gilchrist and 
Nesberg (1952). These writers used clearly 
presented pictures of food objects. In the 
study by Atkinson and McClelland (1948), 
it was noted that nearly half of the food 
stories were given to the one picture con- 
taining a representation of a food object. On 
the basis of these results we might draw the 
tentative conclusion that the motivational 
state of hunger will only have an effect on 
the responses given to material of a per- 
ceptual-cognitive nature when this material is 
meaningfully related to the motivational state. 

The results in none of the investigations 
appear to run counter to the above state- 
ment. The material used by Lazarus et al. 
(1953) consisted of pictures of food. The 
type of material here may thus be in line with 
the statement. However, since the illumina- 
tion of the pictures presented was gradually 
increased until correct reports were obtained, 
the possibility is still open that the material 
may be characterized as not meaningful with 
regard to the need. In the experiment by 
Levine et al. (1942), the pictures were de- 
scribed as differing with respect to degree of 
clear presentation of food object. The results 
of this experiment may, therefore, also be in 
line with the above statement. In the in- 
vestigation by Sanford (1936, 1937), the 
material presented was not described in such 
detail that the results may be examined with 
respect to the statement, but it should be 
noted that the pictures used, as well as the 
drawings in the drawing-completion test, may 
have differed in degree as to clearness of 
presentation of a food object. 

The evidence supporting the conclusion of 
an effect of motivational state on perception- 
cognition is limited. Besides, concentration 
has been almost exclusively on the state of 
hunger. The effect of other types of motiva- 
tional states may possibly be entirely different. 
Also, with the exception of the study by 
Brozek et al. (1950), “hunger” was induced 


for short periods of time. Motivational states 
having developed over longer periods may 
have different effects. Finally, the effects may 
be different in various types of abnormalities. 
However, considering the difficulty of obtain- 
ing evidence on the rationale behind the 
projective techniques, the results of the pres- 
ent examination are worth attention. In line 
with these results, one will expect only ma- 
terials which are meaningful in relation to 
some motivational state to produce responses 
revealing the state of motivation. Thus the 
cards in Rorschach, TAT, and similar tests 
should only occasionally elicit re- 
actions related to some need or conflict. There 
seems to be no strong evidence against this 
point of view. The conceptions about the 
perceptual processes brought to the fore 
during and after the Second World War may 
have exaggerated the dependence of perceptual 
processes on motivational states and person- 
ality factors. The perceptual systems may 
develop relatively independently of other sys- 
tems in the organism, the interaction between 
the perceptual systems and motivational 
states being more on an abstract cognitive 
level. The conceptions of Nissen (1951) of 
perceptual development may give a more 
balanced model than that by the participants 
of the two symposia on perception and person- 
ality (Blake & Ramsay, 1951; Bruner & 
Krech, 1950). 

The results of the Postman and Crutchfield 
(1952) study, where motivational state was 
found to interact with set, suggest that a 
better approach to the study of the relation- 
ship of perceptual-cognitive processes to per- 
sonality factors would be to study serial effects 
as a result of repeated presentations of some 
perceptual-cognitive material suggesting some 
object relevant to the need (Smith, 1963). 

Klein (1954) has reported some experi- 
ments which indicate that sizable individual 
differences manifest themselves in reactions to 
the type of situations dealt with in this re- 
view. Unfortunately, Klein does not seem to 
have presented his results in a complete re- 
port. The report given does not make clear 
that his investigations are at all relevant to 
the problem of the effect of bodily needs on 
perceptual-cognitive processes. Whatever may 
be the correct conclusion resulting from his 
investigations, it is still necessary to explain 
the different effects of different types of 
material on the perceptual-cognitive processes. 

In all the studies, including also those of 
Brozek et al. (1950), the situation may be 
said to be lacking in reality since Ss knew 
that they would obtain food. This knowledge 
possibly may have contributed to a diminution 
in the effect of food deprivation. In a situa- 
tion where a person is left in great uncertainty 
as to whether he will obtain food or not, the 
reactions might be more strongly influenced 
by the state of hunger. However, the opposite 
seems equally possible. As a result of the 
impending danger, reactions might be less 
governed by the motivational state. Un- 
fortunately, it is difficult to obtain informa- 
tion on this problem. 


The Operational Definition of State of Hunger 


Brozek et al. (1950), in their definition of 
state of hunger, could refer to loss in body 
weight and to periods of fasting so long that 
one has reason to believe that physiological 
conditions had been markedly changed. In 
the other experiments, state of hunger was 
defined in terms of hours of food deprivation. 
The time span was short and the effect of the 
deprivation may have depended on the previ- 
ous nutritional state as well as on a number 
of physiological conditions which may vary 
from individual to individual. Also, the defini- 
tion of state of hunger in terms of hours 
of food deprivation meets with the following 
difficulty. 

In Western culture, a rather strict daily 
routine is usually followed with regard to 
meal hours. An expectancy for obtaining food 
and a preoccupation with food may, there- 
fore, be present in the Ss when meal times are 
approaching. A set for food may thus be 
activated to different degrees at different 
hours of the day relatively independent of 
physiological condition. For this reason, 
the results of a number of investigations may 
be subject to an interpretation different 
from that based on amount of food dep- 
rivation. The important condition may not 
be hours since last feeding, but hours left to 
next feeding. This may explain the effect 
found by Sanford (1937), and also why he 
found no difference between the 4-5-hour 
group and 24-hour group. Similarly, it may 
explain the difference in McClelland and At- 
kinson’s (1948) experiment between the 1- 
hour group and 16-hour group, and the 
absence of a difference between the 4-hour 
group and the 16-hour group. It may also 
explain the rise and fall revealed in the 
curves obtained by Levine et al. (1942), by 
Atkinson and McClelland (1948), and by 
Lazarus et al. (1953). 

As will be remembered, in the stories re- 
ported by their Ss, Atkinson & McClelland 
(1948) found an increase of instrumental 
activity, activity aiming at removing the 
source of food deprivation. At the same time 
they found a decrease in goal activity— 
eating, or invitations to eat. These findings 
may be interpreted as a result of some set 
introduced by some uncontrolled condition in 
the experiment. The Ss deprived of food 
would most likely contemplate this state of 
affairs, more so since they do not appear to 
have been informed that they would not be 
served the ordinary meals. The rise in the 
curve for instrumental activity may thus 
simply be a reflection of a preoccupation 
with food deprivation. The decrease in goal 
activity may be seen as a necessary counter- 
part of this preoccupation with food depriva- 
tion. 

In the results of Gilchrist and Nesberg 
(1952), there is a steady increase in scores 
over the whole period of 20 hours. This may 
be due to the fact that Ss did not expect food 
before these hours had elapsed. As will be 
remembered, Ss were informed that they 
would have to go without food for the period 
of time required for the carrying out of the 
experiment. However, it is also possible that 
different mechanisms are operating in this 
situation than in the other ones. 


The Operational Definitions of Perception 


In a book on the logical foundations of 
experimental psychology, Saugstad (1965) 
has argued that a psychological concept must 
have a threefold reference: (a) to a set of 
antecedent conditions, (b) to a set of re- 
sponses, and (c) to a set of conscious ele- 
ments or experiences. Under the influence of 
behaviorism, the third type of reference has 
been neglected, the result being that the 
references to the events studied have become 
vague and elusive. 

On the basis of this threefold reference, 
it seems reasonable to maintain the distinc- 
tion gained from everyday language between 
experiences referring to perception and ex- 
periences referring to imagination. In a num- 
ber of instances, the experiences related to a 
stimulation present at some definite point 
in time and the experiences which are not 
related to some definite stimulation present 
are clearly distinguishable. Thus, the ex- 
perience of a color when light with definite 
characteristics is stimulating the eye is dis- 
tinguishable from the experience when the 
person is asked to imagine some color with 
his eyes closed. To the writer, this distinc- 
tion appears to be one of the most funda- 
mental that can be made in psychology. 

Also, it seems possible to distinguish be- 
tween the experience considered a description 
of the effect of some definite stimulation and 
the experience considered the result of an 
association with the former experience. Thus, 
the experience of color considered the effect 
of the stimulation by light is distinguishable 
from the experience considered an association 
to the experience of color. The latter type of 
experience seems to be more in line with the 
experiences subsumed under imagination. 
The distinction is probably not as clear-cut 
in the latter case as in the former where 
reference is made to presence or absence of 
stimulation. 

With regard to the two above distinctions, 
it should be noted that they are not identical 
to the one introduced by the older research 
workers in perception, such as von Helmholtz 
(1911), who maintained that stimulation of 
a sensory system resulted in two different 
processes: (a) sensory processes, and (b) 
imaginary processes. This distinction cannot, 
at least at present, be maintained on an 
operational basis. 

In the following paragraphs, an examina- 
tion of the operational definitions in the 
experiments reviewed will be undertaken 
from the point of view of whether or not they 
should be considered as belonging under per- 
ception or imagination. 


In the experiment by Gilchrist and Nesberg 
(1952), Ss performed their adjustments from 
memory. The results, therefore, may reflect a 
tendency of Ss to accentuate various remem- 
bered characteristics in the food pictures 
while they made their adjustments. There is 
no basis for a conclusion that the stimulation 
resulted in different processes in the food- 
deprived as against the non-food-deprived 
while it was present. In this connection, men- 
tion should also be made of the fact that this 
study rests on a criterion for identity of il- 
lumination, which is a loose one. The pictures 
presented contained color patches of varying 
degrees of brightness. The question arises as 
to which of these patches the Ss used in making 
their adjustments. The Ss may have changed 
their criterion from presentation to adjustment 
by concentrating on different patches in the 
pictures on the two occasions. If there should 
be such a thing as overall brightness in a 
picture, the impression of overall brightness 
would probably give a loose criterion. Also, 
in the investigation by McClelland and At- 
kinson (1948), the experiences of the Ss could 
not be related to the presence of any definite 
stimulation. As will be remembered, the 
stimulation was a pretended one. 

Sanford (1936, 1937), as well as Levine 
et al. (1942), instructed their Ss to associate 
to the pictures presented. Atkinson and Mc- 
Clelland (1948) told their Ss to make up 
stories about the pictures. Postman and 
Crutchfield (1952) had their Ss construct 
words by adding letters to some presented 
assembly of letters. A similar task may be 
said to be involved in the drawing-completion 
tests by Sanford. Brozek et al. (1950) had 
their Ss associate to food words in the test 
which gave significant differences between the 
food-deprived and the non-food-deprived 
group. All investigations so far examined 
may most reasonably be said to deal with 
imagination. The only experiment which 
might possibly be classified under perception 
is the one by Lazarus et al. (1953). This 
deserves a closer examination. 

Lazarus et al. presented pictures of food 
objects, beginning presentation at an illumina- 
tion below recognition threshold. The question 
arises as to what extent Ss described the 
effects of the illumination present. Most 
likely the serial presentation of the pictures 
resulted in a number of associations. The 
motivational condition most likely affected 
the choice of association to be reported. The 
situation may thus be similar to the one 
where McClelland and Atkinson gave their 
Ss a choice between naming some definite 
food object and two nonfood objects. This 
interpretation is strongly supported by the 
fact that the effect disappeared when Ss were 
provided with a list of response words. The 
fact that no correlation was found between 
food guesses in the prerecognition period and 
amount of food deprivation does not neces- 
sarily contradict this interpretation. The pres- 
ent author has previously pointed out that ma- 
terial not meaningfully related to the motiva- 
tional state probably does not reveal an effect. 
At the lower levels of illumination the pictures 
would probably be poorly structured, and, 
as a result, no effect might be expected from 
motivational state. In conclusion, it might be 
stated that all experiments here reviewed 
most likely deal with processes more reason- 
ably subsumed under imagination than 
under perception. 

In this discussion the author has arrived 
at two conclusions which may help to specify 
the conditions under which cognitive processes 
may be influenced by some motivational state. 
In the first place, the material presented must 
be meaningfully related to the motivational 
state. This seems to imply that the psycholo- 
gist wanting to diagnose some motivational 
state must have specified this state in ad- 
vance and have constructed the material in 
relation to this state. In respect to the motiva- 
tional state, the material must be meaningful 
and not ambiguous. This requirement raises 
the difficulty of specifying an inventory of 
motivational states. Secondly, the material 
must be presented to S in a task which allows 
him freedom to react to the material without 
being forced to describe it in an accurate 
way. In this sense, the task must be ambigu- 
ous. This latter requirement raises the diffi- 
culty of specifying the processes which govern 
the responses of S. In leaving S freedom, the 
psychologist may easily, perhaps necessarily, 
place himself in a position where he is unable 
to specify these processes. 


REFERENCES 


ALLPORT, F. L. Theories of perception and the con- 
cept of structure. New York: Wiley, 1955. 

ATKINSON, J. W., & McCLELLAND, D. C. The 
projective expression of needs. II. The effect of 
different intensities of the hunger drive on thematic 
apperception. Journal of Experimental Psychology, 
1948, 38, 643-658. 

BLAKE, R. R., & RAMSAY, G. V. (Eds.) Perception: 
An approach to personality. New York: Ronald 
Press, 1951. 

BROZEK, J., GUETZKOW, H., & BALDWIN, MARCELLA V. 
A quantitative study of perception and association 
in experimental semistarvation. Journal of Per- 
sonality, 1950-51, 19, 245-264. 

BRUNER, J. S., & KRECH, D. (Eds.) Perception and 
personality: A symposium. Durham: Duke Univer. 
Press, 1950. 

GARNER, W. R., HAKE, H. W., & ERIKSEN, C. W. 
Operationism and the concept of perception. Psy- 
chological Review, 1956, 63, 149-159. 

GILCHRIST, J. C., & NESBERG, L. S. Need and per- 
ceptual change in need-related objects. Journal 
of Experimental Psychology, 1952, 44, 369-376. 

GOLDIAMOND, I. Indicators of perception: I. Sub- 
liminal perception, subception, unconscious per- 
ception: An analysis in terms of psychophysical 
indicator methodology. Psychological Bulletin, 
1958, 55, 373-411. 

GRAHAM, C. H. Visual perception. In S. S. Stevens 
(Ed.), Handbook of experimental psychology. 
New York: Wiley, 1951. Pp. 868-920. 

HELMHOLTZ, H. VON. Handbuch der physiologischen 
Optik, III. 3. Aufl. Leipzig: Voss, 1911. 

HOCHBERG, J. Perception: Toward the recovery of a 
definition. Psychological Review, 1956, 63, 400-405. 

JENKIN, N. Affective processes in perception. Psy- 
chological Bulletin, 1957, 54, 100-127. 

KLEIN, G. S. Need and regulation. In M. R. Jones 
(Ed.), Nebraska symposium on motivation: 1954. 
Lincoln: Univer. Nebraska Press, 1954. Pp. 224- 
274. 

LAZARUS, R. S., YOUSEM, H., & ARENBERG, D. 
Hunger and perception. Journal of Personality, 
1953, 21, 312-328. 

LEVINE, R., CHEIN, I., & MURPHY, G. The relation of 
intensity of a need to the amount of perceptual 
distortion: A preliminary report. Journal of Psy- 
chology, 1942, 13, 283-293. 

McCLELLAND, D. C., & ATKINSON, J. W. The pro- 
jective expression of needs: I. The effect of dif- 
ferent intensities of the hunger drive on per- 
ception. Journal of Psychology, 1948, 25, 205-222. 

MURPHY, G. Personality: A biosocial approach to 
origins and structure. New York: Harper, 1947. 

NISSEN, H. W. Phylogenetic comparison. In S. S. 
Stevens (Ed.), Handbook of experimental psy- 
chology. New York: Wiley, 1951. Pp. 347-386. 

PASTORE, N. Need as a determinant of perception. 
Journal of Psychology, 1949, 28, 457-475. 

PERKY, C. W. An experimental study of imagination. 
American Journal of Psychology, 1910, 21, 422- 
452. 




POSTMAN, L., & CRUTCHFIELD, R. C. The inter- 
action of need, set and stimulus structure in a 
cognitive task. American Journal of Psychology, 
1952, 65, 196-217. 

PRENTICE, W. C. H. “Functionalism” in perception. 
Psychological Review, 1956, 63, 29-38. 

SANFORD, R. N. The effects of abstinence from food 
upon imaginal processes: A preliminary experiment. 
Journal of Psychology, 1936, 2, 129-136. 

SANFORD, R. N. The effects of abstinence from food 
upon imaginal processes: A further experiment. 
Journal of Psychology, 1937, 3, 145-159. 

SAUGSTAD, P. An inquiry into the foundations of 
psychology. London: Allen & Unwin, 1965. 

SAUGSTAD, P., & SCHIOLDBORG, P. Value and size per- 
ception. Scandinavian Journal of Psychology, 1966, 
7, in press. 

SMITH, G. Process: A biological frame of reference 
for the study of behavior. Scandinavian Journal 
of Psychology, 1963, 4, 44-54. 

SOLLEY, C. M., & MURPHY, G. Development of the 
perceptual world. New York: Basic Books, 1960. 


(Received January 29, 1965) 


Psychological Bulletin, 1966, Vol. 65, No. 2, 91-98 


RELATIONSHIP BETWEEN EEG AND TEST INTELLIGENCE: 
A COMMENTARY * 


ROBERT J. ELLINGSON 


Nebraska Psychiatric Institute, University of Nebraska College of Medicine 


In a recent review Vogel and Broverman concluded that, contrary to previously 
expressed opinions, there do appear to be relationships between EEG phenomena 
and IQ—at least among children, the retarded, and institutionalized geriatric 
and brain-damaged patients. The evidence for such relationships is reexamined. 
The following conclusions are drawn: (a) The evidence concerning relation- 
ships between normal brain-wave phenomena and IQ in children and in the 
mentally retarded is contradictory and inconclusive. (b) The weight of avail- 
able evidence suggests that there is no relationship in normal adults. (c) EEG 
abnormality and decreased intellectual capacity are both effects of organic 
brain disorders, and hence tend to be related to one another. 


In a recent issue of this journal Vogel and 
Broverman (1964) reviewed literature per- 
taining to the relationship between brain-wave 
phenomena and intelligence. They drew the 
following conclusions: 


the bulk of the studies with feebleminded subjects, 
children, institutionalized geriatric subjects, and 
brain-injured adults have reported significant EEG- 
test intelligence relationships. However, with the 
exception of Mundy-Castle (1958) and Mundy- 
Castle and Nelson (1960), the investigators who 
have studied normal adults have not found sig- 
nificant relationships between test intelligence and 
EEG tracings. It appears, then, that relationships 
between EEG and test intelligence are most evident 
in subjects who have either relatively undeveloped 
function (i.e., children; feebleminded subjects) or 
deteriorated intellectual function (brain-damaged and 
institutionalized geriatric subjects) [p. 139]. 


These conclusions are similar in part to those 
of Netchine (1959), and ostensibly dissimilar 
to those of Lindsley (1944), Ostow (1950), 
and Ellingson (1956), and, more recently, of 
Hill (1963). 

This commentary is offered because the evi- 
dence is by no means as convincing as Vogel 
and Broverman make it out to be. 

To begin with, confusion results from failure 
to distinguish between EEG abnormality and 
normal brain-wave phenomena. There is un- 
doubtedly a significantly high rate of EEG 
abnormality (excessive slow activity, etc.) 
among patients with organic brain disorders 
(including the more severely mentally re- 
tarded, most of whom are brain damaged). 
Patients with organic brain disorders tend to 
display intellectual deficits. Hence, many pa- 
tients with EEG abnormalities tend to display 
intellectual deficits. These are clinical facts 
which have been accepted for many years, 
and are certainly not in dispute. 

Whether EEG abnormalities occur in pa- 
tients with organic brain disorders, another 
manifestation of which may be intellectual 
deficit, and whether and how normally oc- 
curring EEG phenomena are related to “nor- 
mal” variations in intellectual performance, 
are essentially separate questions. This dis- 
tinction was not evident in Vogel and Brover- 
man’s exposition, but perhaps they do not 
agree that the distinction is a useful one. 
However, it was with the question of the 
relationship between the normal physiological 
activity of the brain (as partly reflected in 
the EEG) and intellectual performance, that 
Lindsley, Ostow, and the author were con- 
cerned when we stated our essentially nega- 
tive conclusions about relationships between 
brain-wave activity and intelligence. It is with 
this question that this commentary will mostly 
be concerned. 


METHODOLOGY 


Vogel and Broverman dealt with several 
methodological topics: measurement of intel- 
ligence, placement of EEG leads, conditions 
of EEG recording, sex differences, and EEG 
indices. The author largely subscribes to 
what they wrote, but with serious reservations 
about their inference (1964, p. 140) that 
children, the retarded, and brain-damaged 
or impaired geriatric patients should perform 
similarly (i.e., show a similar lack of dif- 
ferentiation of cognitive processes) across a 
wide range of psychological tests. Also, the 
evidence for sex as a significant variable in 
EEG studies (pp. 141-142) is unconvincing 
and has not been generally confirmed. 

2 Such patients are, by definition, not mental 
retardates if the onset of the organic brain disease 
occurred after early childhood. 

There are a number of additional methodo- 
logical considerations, mostly concerning EEG 
techniques, which they did not discuss. A few 
of these will be set forth briefly in this 
section. Their relevance will be evident in 
the following section. 

Definition 

First is the matter of definition, or the 
criteria by which a specific EEG phe- 
nomenon is identified. Criteria of the vari- 
ous EEG phenomena varied a good deal in the 
early years of electroencephalography. Many 
of the studies which Vogel and Broverman 
cited were reported during that period prior to 
1940, and thus there is some ambiguity among 
them concerning the precise phenomena being 
examined. (The most widely accepted current 
definitions of EEG phenomena will be found 
in Brazier et al., 1961.) 


Derived Indices 


Many derived indices (arithmetic deriva- 
tives of two or more parameters of EEG 
phenomena) have been developed over the 
years. It is clear that with all the observations 
and measurements that one can make on 
EEG tracings, an almost unlimited number 
of derived indices can be devised. It becomes 
necessary, therefore, to exercise selectivity in 
choosing among the possibilities. The potential 
theoretical significance of a measure 
would seem to be the most rational basis for 
choosing it. For example, identification of the 
alpha rhythm with an excitability cycle, 
gating the flow of information to and from 
the forebrain, provides grounds for studying 
alpha frequency in connection with information-handling 
processes. Such a rationale for 
the development of derived indices has often 
been lacking. 


Reliability 

The reliability of clinical EEG interpretation 
has been little studied (Houfek & Ellingson, 
1959). The reliability of quantified 
measures of EEG phenomena, such as alpha 
frequency and alpha index, has been somewhat 
more often investigated. However, 
very few reports in the literature on EEG 
and intelligence even mention reliability; 
notable exceptions are the Johannesburg 
papers (Biesheuvel & Pitt, 1956; Mundy- 
Castle, 1958; Mundy-Castle & Nelson, 1960). 

The author concedes that an experienced 
electrophysiologist can reliably identify the 
alpha rhythm and reliably measure its frequency, 
and can also determine alpha index 
and amplitude reliably for a given recording. 
The stability of the latter two variables under 
varying conditions is, however, always open 
to question (and indeed may itself be an 
object of study), and test-retest reliability 
should be demonstrated for these variables. 
It goes without saying that reliability of 
newly devised indices should be established; 
unfortunately this precaution has been universally 
ignored in the literature under 
examination. 


Controls 


Normal control groups have been employed 
when appropriate in all studies of EEG and 
intelligence except one (Rahm & Williams, 
1938), in which the brain electrical activity of 
the patients was judged against general stand- 
ards of abnormality. In several studies, the 
control groups were somewhat less than adequate 
in terms of numbers and/or ages of the 
subjects. Blind clinical EEG interpretations 
and measurements of alpha frequency and 
other phenomena are mentioned as controls in 
only a few studies of EEG and intelligence 
(Corriol & Cain, 1949; Wolfensberger & 
O’Connor, 1965). Few investigators of EEG 
and intelligence have controlled for EEG 
abnormality, that is, separated subjects with 
abnormal from those with normal EEGs before 
making measurements of alpha frequency, 
index, etc. In a number of studies, insufficient 
EEG data were available upon which to base 
such a judgment in any case. 

The necessity for knowing as thoroughly as 
possible the nature of the group subjected 
to study is as important as the need for con- 
trolling for EEG abnormality. This is par- 
ticularly important when using retarded sub- 
jects, since mental retardation is a symptom 
with a wide variety of causes, the EEG correlates 
of which vary enormously, as do EEG 
variations related to associated symptoms, 
such as seizures. To attempt to use EEG 
data gathered on samples of undifferentiated 
retarded subjects, as has been done in a num- 
ber of studies, is almost as pointless in this 
type of study as indiscriminately selecting 
all the patients in a general hospital. 


THE EVIDENCE REVISITED 3 


Relationships between EEG and Test Intel- 
ligence in Mental Defectives 


Vogel and Broverman (1964) concluded 
that “The weight of evidence, then, supports 
the proposition that a significant relationship 
exists between intelligence and various aspects 
of EEG functioning (especially occipital alpha 
frequency) among feebleminded persons, par- 
ticularly at the lower ranges of feebleminded- 
ness [p. 135].” Let us reexamine the evidence. 

In one of his classical papers, Berger (1933) 
described his observations on three moderately 
retarded subjects (Imbezillen) and four 
severely retarded subjects (Idioten) of various 
ages. No intelligence test scores were 
given. He reported that the tracings of these 
subjects were of generally lower voltage than 
those of normals and that the duration 
(period) of the alpha waves was longer (up 
to 250 msec) than in normals (average 100 
msec). He was fully aware of the effects of 
fluctuation of attention on alpha amplitude. 
He employed various controls, including 
repeated recordings, and he satisfied himself 
that the voltage differences were not due to 
attentional factors. In view of the small number 
of subjects (N) and the fact that only 
single tracings were obtained, these results 
can be taken as no more than suggestive. The 
one illustration of a tracing from a 26-year- 
old idiot looks suspiciously abnormal. 

3 Vogel and Broverman’s order of presentation is 
followed as closely as possible. It would be helpful 
in understanding the following discussion to have 
their paper available for reference, since it is not 
possible in the space available to repeat their 
presentation in detail. 

The series of studies by Kreezer and Smith 
(Kreezer, 1936, 1937, 1938, 1939, 1940; 
Kreezer & Smith, 1937, 1950) can be dis- 
cussed as a group since the procedure was 
apparently the same in all and the data re- 
ported in the various papers overlap to some 
extent. The data were all collected prior to 
1940 (the 1950 article, which Vogel and 
Broverman apparently considered a later 
replication, was actually based on data col- 
lected in 1936). The procedures used were 
described in great detail in the 1939 paper. 
Sequences of five or more waves in the fre- 
quency range of 7—14/sec were considered 
alpha. Average amplitudes were based on the 
measurement of 80 waves per tracing. Separate 
recordings of about 60 seconds duration were 
made from right occipital, motor area, and 
frontal electrodes linked with an earlobe 
reference electrode. Such records are inade- 
quate for making judgments about EEG 
normality or abnormality, and Kreezer and 
Smith did not attempt to do so. They are not 
to be criticized for this limited procedure, be- 
cause at that early time they had only a one- 
channel amplifier with an expensive photo- 
graphic recording system. The point is that, 
however good the reasons, there was no con- 
trol for EEG abnormality. Such tracings, 
recorded in a completely dark room with ap- 
parently no control for eyelid position, also 
provide no possibility of determining if the 
prevalence of alpha was typical for a subject 
under standard conditions (awake, relaxed, 
with eyes closed). Data were gathered on 
various groups of mongoloids, undifferentiated 
“familial” and “hereditary” retardates, 
and phenylketonurics. A group of 22 controls, 
3 months of age to adult, was mentioned in 
one article (Kreezer, 1936), and apparently 
the same or other controls were used in the 
other studies. 

Excluding the two abstracts (Kreezer, 
1937; Kreezer & Smith, 1937) which were 
preliminary reports containing no specific 
data, and allowing as best one can for over- 
lapping data, Kreezer and Smith overall re- 
ported five failures and no successes in at- 
tempts to relate mental age (MA) and alpha 
frequency; one success, one partial success, 
and two failures to relate MA and alpha 
amplitude; two successes, one partial success 
(almost surely due to the age factor), and 
one failure to relate MA and alpha index; 
and three failures to relate MA and delta 
index (delta being defined as activity of less 
than 8/sec frequency). The only support the 
author can find for Vogel and Broverman’s 
statement that “A positive relationship be- 
tween alpha frequency and MA was also found 
in familial defectives [p. 133],” is Kreezer’s 
1937 abstract, based on an unspecified num- 
ber of adult retardates of various types, in 
which Kreezer stated that “Subjects of higher 
MA tend to differ from those of lower MA 
in... a greater frequency of waves in 
regular rhythms.” This finding was not con- 
firmed in any of the more detailed reports. 
Vogel and Broverman have apparently ac- 
cepted the “later” correlation of .323 (p< 
.06) as significant (Kreezer & Smith, 1950). 
Do they then also accept as significant the 
unusual correlation of —.322 (also p < .06) 
between chronological age (CA) and alpha 
frequency obtained on the same subjects? 
The author prefers to escape this dilemma by 
adherence to the p < .05 level of confidence. 

If the author has read Vogel and Broverman 
(1964, p. 133) correctly, their information 
about what they consider later confirming 
studies of Bernhard and Skoglund (1939), 
Gunnarson (1945), and Novikova (1956) was 
derived from Netchine (1959). Bernhard and 
Skoglund reported the results of studying a 
group of 16 “not at all homogeneous” 6-18- 
year-old retardates; their finding of half with 
alpha frequencies below the normal range 
cannot be taken as more than suggestive in 
view of the small N and lack of control for 
EEG abnormality. Gunnarson’s (1945) study 
suffered from so many methodological limita- 
tions as not to warrant discussion. Novikova 
(1956) presented no specific alpha frequency 
data. Her findings indicated that retardates’ 
EEGs show poorer alpha rhythms and more 
slow activity than the EEGs of normal chil- 
dren of the same ages, and that the more 
severe the retardation, the greater the EEG 
abnormality. 


Vogel and Broverman cited three studies 
as not supporting a relationship between 
mental retardation and alpha index (sic). 
These provide no more convincing negative 
evidence than the above studies do positive 
evidence. The study of Rahm and Williams 
(1938) suffers from a number of the same 
methodological deficiencies as several other 
early studies. Corriol and Cain’s (1949) study 
involved 36 5—15-year-old heterogeneous boys 
with character disorders, of whom 16 were 
retarded. Clinical interpretations were done 
blind, but no quantified alpha measures were 
reported. The only pertinent finding was that 
all but two subjects displayed alpha rhythm. 
Lindsley’s (1938) report involved only four 
retarded subjects. 

Vogel and Broverman also cited a study of 
1,118 retardates by Gibbs, Rich, Fois, and 
Gibbs (1960), which showed a higher in- 
cidence of EEG abnormality among the more 
severely retarded. They did not mention that 
Gibbs et al. also found no difference in alpha 
distribution between normal adults and undifferentiated 
adult retardates, but they did 
observe a slightly higher incidence of low 
voltage waking records among retardates at 
all ages. 

Berkson, Hermelin, and O’Connor (1961), 
a study not cited by Vogel and Broverman, 
found no differences among controls and three 
groups of young-adult retardates with respect 
to alpha frequency or alpha index, despite 
the fact that the mean IQs of two of the 
retardate groups were 33.5 (“imbeciles”) 
and 30.6 (mongoloids). Lindsley and Berkson, 
in an unpublished study cited by Berkson 
(1963, p. 565) in another context, found no 
alpha frequency differences between 13 young- 
adult subjects of superior intelligence and 1? 
young-adult retarded subjects with normal 
EEGs. 

The studies of Netchine and his colleagues 
(Netchine & Lairy, 1960; Netchine & Netchine, 
1962; Netchine, Talan, Lairy, & Zazzo, 
1959) can be discussed as a group. These 
workers are to be commended for their ingenuity 
in devising complex indices of brain 
electrical activity, but it could be wished that 
they had seen fit to provide data on the reliability 
of the indices. Netchine et al. (1959) 
reported significant correlations in 30 14-2 
year-olds between Binet-Simon MA and alpha 
index, alpha amplitude, magnitude of pos- 
teroanterior voltage gradient, and incidence 
of their “Type B” EEG record. However, as 
Vogel and Broverman have pointed out, there 
was no mention of control for CA in the report. 
That this omission was significant is sug- 
gested by the fact that when alpha index was 
correlated with WISC IQ the correlations 
tended to disappear. The author can find no 
information to support Vogel and Broverman’s 
statement that Netchine et al. “demonstrated 
that . . . occipital alpha frequency provided 
lower correlations with MA than did measures 
of alpha frequency which were obtained from 
the parietal, rolandic, or frontal leads [p. 
134].” Netchine et al. presented no alpha fre- 
quency data as such. 

Netchine and Lairy (1960) studied 209 
5-12-year-olds with a wide range of IQs, 
controlled for age. They reported a variety of 
results, the statistical significance of most of 
which is unclear. The author will comment 
only on Vogel and Broverman’s (1964) state- 
ment: 


Both positive and negative significant relationships 
between alpha frequency and intelligence have been 
reported by Netchine and Lairy (1960) in a sample 
of 209 children who . . . ranged in intelligence from 
defective through superior. The correlation between 
alpha frequency and intelligence at age 5-6 years 
was —.34; at age 7-8 years, .20; at age 9-10, .30; 
at age 11-12, .38 (all correlations, except that at 
age 7-8 years, are statistically significant) [p. 135]. 


The facts are these: (a) The data were not 
based on alpha frequency but on occipital 
frequency, which is not necessarily the same 
thing. (b) The correlations cited (Spearman’s 
ρ) were based on only the 67 subjects with 
IQs over 100. On the basis of the entire population 
of 209, higher occipital frequencies 
were found for subjects with higher IQs than 
lower IQs at all age levels (significance un- 
known). (c) Netchine and Lairy did not comment 
on the statistical significance of the 
above coefficients. The Ns listed by them for 
subjects with IQs over 100 in the four age 
groups were 7, 16, 25, and 19, respectively. 
None of the coefficients is significant. 
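
The statement that none of the four coefficients reaches significance can be checked with a rough calculation. The sketch below is an illustration added here, not part of Netchine and Lairy's analysis. It assumes the usual t approximation for a correlation coefficient, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom, which is only approximate for Spearman's coefficient at Ns this small, and it reads the first quoted coefficient as -.34. The function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value: integrate the t density from 0 to |t| by Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central_area)


def correlation_p(r, n):
    """Approximate two-tailed p for a correlation r based on n pairs (assumed t test)."""
    df = n - 2
    t = r * math.sqrt(df / (1.0 - r * r))
    return two_tailed_p(t, df)


# Coefficients quoted by Vogel and Broverman, paired with the Ns listed above.
reported = [("5-6", -0.34, 7), ("7-8", 0.20, 16),
            ("9-10", 0.30, 25), ("11-12", 0.38, 19)]
p_values = {age: correlation_p(r, n) for age, r, n in reported}

for age, p in p_values.items():
    print(f"age {age}: approximate two-tailed p = {p:.3f}")
```

Under these assumptions every p value exceeds .05, which agrees with the statement above.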

Vogel and Broverman (1964) stated that 
Netchine and Netchine (1962), “employing 
another EEG index based upon alpha rhythm 
and an amplitude measure, again successfully 
differentiated high- and low-grade feeble- 
minded subjects [p. 134].” The two groups 
the Netchines used in the study were in fact 
matched for MA and IQ, but differed in terms 
of performance on perceptual-motor tests. The 
two groups were not significantly different 
with respect to alpha frequency or postero- 
anterior voltage gradient. 

In conclusion, the author does not feel that 
the weight of the above evidence indicates 
significant EEG-test intelligence relationships 
among the retarded. 


Relationships between EEG and Test Intelli- 
gence in Children 


In three age-controlled studies (Henry, 
1944; Knott, Friedman, & Bardsley, 1942; 
Netchine & Lairy, 1960), 4 of the 14 com- 
puted correlations between IQ and alpha fre- 
quency were statistically significant. Three 
coefficients were negative and the remainder 
positive, the most impressive being .50 (p < 
.01) reported on 48 8-year-olds by Knott et 
al., who also obtained an insignificant correla- 
tion of .12 on 42 12-year-olds. 

A positive relationship between IQ and 
alpha index was reported in one study 
(Netchine & Lairy, 1960), and no relation- 
ship was reported in two others (Knott et al., 
1942; Lindsley, 1938). Henry reported three 
significant negative correlations between Stan- 
ford-Binet IQ and delta index out of a total 
of eight correlations computed. Netchine and 
Lairy (1960) reported higher theta (4—8/sec 
activity) indices in the occipital area in their 
low IQ groups; presumably the differences 
were not significant elsewhere. 

Henry’s (1944) comment would appear to 
be pertinent: “In view of the disconcerting 
manner in which such correlations shift about 
at different ages, discretion prompts the more 
conservative conclusion that there is no dem- 
onstrated relationship [pp. 42-43].” 


Relationship between EEG and Test Intelli- 
gence in Normal Adults 


Only one laboratory (Mundy-Castle, 1958; 
Mundy-Castle & Nelson, 1960) has reported 
a positive relationship between Wechsler- 
Bellevue test intelligence and alpha frequency. 
Sugarman (1961) obtained a significant nega- 
tive correlation between scores on the New 
South African Group Intelligence Test and 
alpha frequency for a group of 35 university 
students, but not for a group of 15 research 
staff members. Shagass (1946), Biesheuvel 
and Pitt (1955), and Gastaut (1960) all 
failed to demonstrate significant relationships. 
Shagass’ correlation of —.18 between Royal 
Canadian Air Force Classification Test scores 
and alpha frequency is particularly worth 
noting in view of the size of his sample (N = 
1,100). No relationships have been found be- 
tween intelligence and alpha index or alpha 
amplitude (Gastaut, 1960; Mundy-Castle, 
1958; Sugarman, 1961). 

The remainder of Vogel and Broverman’s 
review of the evidence pertains to geriatric 
and brain-damaged patients and essentially 
illustrates the facts, which the author has 
already pointed out, that patients with or- 
ganic brain disorders tend to have abnormal 
EEGs and tend to display decreased intellec- 
tual capacity. There is no need to belabor 
these accepted facts further. 


Discussion 


The author offers the following conclusions 
as alternatives to those of Vogel and Brover- 
man as well as to his own earlier conclusion 
(Ellingson, 1956, p. 18): (a) The evidence 
concerning relationships between normal brain- 
wave phenomena and intelligence in chil- 
dren and in the mentally retarded is con- 
tradictory and inconclusive. (b) The weight 
of available evidence suggests that there is no 
relationship in normal adults. (c) EEG ab- 
normality and decreased intellectual capacity 
are both effects of organic brain disorders, 
and hence tend to be related to one another. 

Despite increasing methodological sophisti- 
cation, the author must confess to a continu- 
ing pessimism about finding significant and 
important relationships between EEG phe- 
nomena and complex behavioral processes for 
reasons previously stated (cited by Vogel and 
Broverman, 1964, p. 132), and perhaps more 
subtly stated by Gastaut (1960): 


it appears that the primitive bioelectric function of 
the brain, simple and similar in different stages of 
phylogenesis, cannot be connected to immediate 
modes of apprehension of an entity as complex and 
phylogenetically recent as the human personality. 
This would explain the negativity of our results and 
those of our forerunners, who in 25 years of effort, 
have accumulated a multitude of positive factors, 
but so utterly contradictory that their algebraic sum 
is practically nil.... Far from abandoning the 
correlations between EEG and personality, our 
future task must therefore be the transformation of 
information obtained from each one of them, in 
order to render them comparable [p. 227]. 


If relationships between complex behavior 
and brain electrical activity are to be found, 
it is more likely that they will be found by 
recording brain electrical activity during S-R 
sequences, than during rest and relaxation 
(Knott, 1940; Vogel & Broverman, 1964). 
Berkson and Lindsley (cited by Berkson, 
1963) and Wolfensberger and O'Connor 
(1965) have compared the latencies and dura- 
tions of alpha blocking responses of retardates 
and controls. Wolfensberger and O'Connor 
found that the latency of alpha blocking was 
significantly greater in a group of retardates 
than in a group of normals. Berkson and 
Lindsley found no differences in latency of 
alpha blocking, but did find that duration of 
alpha blocking was greater for their normal 
group. The latter finding was replicated by 
Baumeister, Spain, and Ellis (1963) and 
Berkson, Hermelin, and O’Connor (1961), 
but Wolfensberger and O’Connor failed to 
find such a difference. Hermelin and Venables 
(1964) found no differences in motor reaction 
times when stimuli were presented during 
alpha blocking as compared with stimuli pre- 
sented while alpha was present for either 
normals or a group of severe retardates. Other 
possible approaches include studies of brain 
electrical responsiveness to repetitive stimuli 
of varying frequency (Zislina, 1956), and 
studies of measurements of the latencies, 
amplitudes, and waveforms of both specific 
and nonspecific cerebral evoked responses 
(Katzman, 1964) under various conditions. 
Finally, there always remains the possibility 
that new brain electrical phenomena will be 
discovered or new transformations of brain 
electrical data will be devised, which will 
yield better correlations than those now 
known (Walter, Cooper, Aldridge, McCallum, 
& Winter, 1964). 


Vogel and Broverman critically examine Ellingson’s commentary on their article 
“Relationship between EEG and Test Intelligence.” They find to be unwarranted 
Ellingson’s conclusions that no substantial evidence exists of a relationship 
between EEG and IQ. It is held that Ellingson’s commentary is based 
essentially upon mistakes in regard to fact, and faulty assessment of the data. 


Ellingson (1966) claims that Vogel and 
Broverman’s (1964) conclusions of a relationship 
between EEG and intelligence are untenable. 
His position is based upon a critique of 
the Vogel and Broverman article, and upon 
his own evaluation of the literature. We will 
attempt to demonstrate that Ellingson is incorrect 
both in his assessment of the Vogel 
and Broverman article, and in his assessment 
of the literature. First, however, we wish to 
comment on two themes which occur throughout 
Ellingson’s paper. 


The Comparative Value of Positive and Negative Results 


Ellingson seems to compare the number of 
studies reporting significant relationships with 
the sum of the studies reporting no relationships, 
and, if the two totals are more or less 
equal, to conclude that “the weight of the 
evidence suggests no relationship.” Thus Ellingson 
cites six studies on the relationship of 
EEG to intelligence in normal adults, finds 
that three report significant relationships and 
three do not, and concludes there is no relationship. 
We do not agree that a study reporting 
no relationship between two variables 
renders of null effect a study which does report 
such relationships, unless one is a replication 
of the other. The finding of a statistically 
significant relationship requires some explanation; 
at an inescapable minimum it suggests 
that, given particular conditions, a relationship 
exists between the variables in question. 


On Ellingson’s Representation of Our Views 


Ellingson (1966) treats every finding of “no 
relationship” between EEG and intelligence as 
if it were a point against our position. He attempts 
to associate us with the argument that 
there should be demonstrable relationships be- 
tween all cognitive test results and all EEG 
indices under all conditions, which is virtually 
the diametric opposite of the position we do, 
in fact, advocate. Our paper (Vogel & Brover- 
man, 1964) indicates our opinion that while 
EEG is related to test intelligence, the rela- 
tionship will be more or less demonstrable de- 
pending upon (a) the controls which are in- 
stituted for relevant variables, such as sex 
and age, (b) the means employed to assess 
intelligence, (c) the conditions of EEG ad- 
ministration, and (d) the EEG indices em- 
ployed as dependent variables. 


METHODOLOGY 


We are pleased to note that Ellingson (1966, 
p. 92) voices general agreement with Vogel and 
Broverman (1964) on methodological ques- 
tions. However, Ellingson proceeds to make 
a number of methodological comments of his 
own, to some of which we take strong excep- 
tion. 

Ellingson writes that the “evidence for sex 
as a variable in EEG studies . . . is uncon- 
vincing and has not been generally confirmed.” 
In our opinion, he is incorrect. Vogel and 
Broverman (1964) cited a lengthy (but 
partial) list of studies which reported such 
relationships. As early as 25 years ago, Kreezer 
(1939) showed that failure to control for sex 
could seriously alter obtained relationships be- 
tween EEG and intelligence. 


Derived Indices 

We deny Ellingson’s (1966) implication 
that his comments upon “derived indices” are 
pertinent to the material we reviewed. 
Strangely, Ellingson himself would seem to 
agree with us, for he writes “The author con- 
cedes that an experienced electrophysiologist 
can reliably identify the alpha rhythm and 
reliably measure its frequency, and can also 
determine alpha index and amplitude reliably 
for a given recording [p. 92].” In fact it is 
exactly the variables Ellingson mentions, 
measured under standard conditions, which 
were the focus of study in the papers reviewed 
by Vogel and Broverman. Consequently, the 
question of reliability does not seem to be 
relevant. It is true that some of the papers 
we cited investigated delta and theta phenomena, 
but Ellingson’s comment upon an 
“experienced electrophysiologist [being able] 
to reliably identify” particular phenomena is 
as true for delta and theta as it is for alpha. 
Only Netchine and Novikova employed cer- 
tain nonstandard EEG variables (“derived in- 
dices”), and here we agree that reliability 
measures should have been provided. 


Controls 


Ellingson (1966, pp. 92-93) argues for the 
necessity of employing a control for EEG 
abnormality. The problem is more complex 
than might appear. Approximately 15% of 
neurologically normal, control subjects evi- 
dence abnormal EEGs. It is not readily ap- 
parent why subjects who evidence abnormal 
EEGs in the absence of clinical organic pathol- 
ogy should be considered separately from sub- 
jects with normal EEGs. There is disagree- 
ment both as to what constitutes EEG 
abnormality, and as to the behavioral sig- 
nificance of EEG abnormality in the absence 
of clinical pathology. For example, certain 
EEG signs which have been traditionally re- 
garded as pathognomic have not been found 
to have any clinical or behavioral significance 
in the absence of overt clinical pathology 
(Obrist & Busse, 1964). Henry (1944), in an 
intensive study of normal children without 
brain disease who evidenced abnormal EEGs, 
concluded that no significance should be at- 
tached to abnormal EEG records evidenced 
by otherwise healthy children. 

Ellingson (1966, p. 93) writes that re- 
tarded subjects should not be studied in un- 
differentiated groups, but should be studied in 
groups with similar primary diagnoses. The 
inference is that this control has hitherto not 
been observed in studies of the relationship 
between EEG and intelligence. However, virtually 
every study which has reported significant 
relationships between EEG and intelligence 
has in fact either employed etiological 
types as an experimental variable or has controlled 
for it. The few exceptions (for example, 
Netchine, Talan, Lairy, & Zazzo, 1959; 
Netchine & Netchine, 1962) have classified 
subjects into separate groups on the basis of 
test behavior or overt behavior. Consequently, 
Ellingson’s (1966) comments on this topic are 
not relevant to the studies reviewed by Vogel 
and Broverman (1964). 


THE EVIDENCE 


Relationships between EEG and Test Intel- 
ligence in Mental Defectives 


Ellingson’s review of the literature on re- 
tardates is marked by errors of fact; ironi- 
cally, he often seems most incorrect when he 
attributes error to Vogel and Broverman 
(1964). 

Ellingson (1966, p. 93) precedes his discus- 
sion of Kreezer’s (1936, 1937a, 1937b, 1938, 
1939, 1940) and Kreezer and Smith’s (1937, 
1950) papers with a critique of Kreezer's 
methodology. Ellingson suggests that the non- 
standard recording procedure reported in the 
1939 paper was the same that was utilized in 
all the Kreezer papers; this is not true. The 
procedure was developed by Kreezer for use 
with mongols because of the difficulty involved 
in persuading such persons to conform 
to standard EEG instructions (stay awake, 
relax, keep eyes closed). In a study in which 
a large number of higher-level persons were 
subjects, Kreezer and Smith (1937) employed 
the standard EEG instructions. At this point, 
it would seem that the question may be 
raised of the procedural adequacy of numer- 
ous EEG studies of retardates which employ 
the standard EEG instructions suggested by 
Ellingson, yet which typically institute no 
checks to see if the instructions are in fact 
being followed. 

The fact is that in some ways Kreezer displays 
greater methodological sophistication 
than investigations of 20 years later. In his 
later papers (1939, 1940; Kreezer & Smith, 
1950), Kreezer systematically investigates the 
effects of type of mental deficiency, chronological 
age, mental age, electrode resistance, 
electrode location, and sex upon the relation- 
ship of EEG to intelligence; this is in addi- 
tion to variables which are controlled but not 
systematically investigated, such as conditions 
of EEG recording. Such extensive investigations 
of the relationship of these variables to 
intelligence have not been reported in the 
literature since. 

Ellingson (1966, p. 94) tallies the number 
of “successes” and “failures” in Kreezer’s at- 
tempts to relate EEG to intelligence. We do 
not feel such a summary of results as Elling- 
son presents can ever result in a proper ap- 
preciation of the data. First, the issue is 
raised of whether a negative result has the 
same meaning or weight as a positive result. 
Second, the problem exists of whether every 
finding of “no relationship” should be counted 
as a failure. For example, Kreezer, from his 
earliest paper (1936), maintained that no re- 
lationship existed between alpha frequency 
and mental age in mongols, yet he computed 
such correlations for samples of mongols in 
order to demonstrate differences between MA 
and various EEG indices as a function of etio- 
logical type. 

Most important, however, Ellingson’s (1966, 
p. 94) tally of the results of Kreezer’s papers 
seems incorrect. We expect that the difficulty 
is partially due to Ellingson’s failure to give 
references; his failure to explain what he 
means by the terms “success,” “partial success,” 
“allowing as best one can for overlapping 
data”; and finally, by a simple failure 
to count significant results as significant results. 
There is also a borderland where we 
are no doubt in disagreement with Ellingson; 
that is, how one counts results which are reported 
by an author as a “success,” where the 
data can be demonstrated to be statistically 
significant, but where no significance level is 
specifically given. 

In our attempt to replicate Ellingson’s tally 
of Kreezer’s work, we have excluded from con- 
'Sideration: all findings for which a significance 
level is not specifically cited in Kreezer’s 
Papers; the two abstracts specifically excluded 
by Ellingson (Kreezer, 1937a; Kreezer & 
Smith, 1937); and one paper (Kreezer, 
1937b)¹ not cited by Ellingson. We are still
unable to replicate Ellingson’s figures and find 
him short in “successes.” For example, in re- 
gard to alpha frequency: on page 94, Ellingson 
cites “no successes in attempts to relate mental 
age (MA) and alpha frequency” in retardates. 
We, on the other hand, count two successes. 
Kreezer (1940) reported a correlation of .32 
(p < .05) between alpha frequency and MA
for 46 familial mental deficients. In 1950, 
Kreezer and Smith reported a correlation of 
.323 (p < .052, df = 34) between the same
variables in another group of 36 familial 
mental defectives. Ellingson refuses to accept 
this latter result as significant, because he 
prefers “adherence to the p < .05 level of
confidence [p. 94],” but, as may be seen, the
level of significance is much closer to the
p < .05 than to p < .06, and we believe
this finding would be accepted as significant 
by most psychologists. We consider the 1950 
paper a replication (a fact which is also dis- 
puted by Ellingson, p. 93) because both stud- 
ies employed retards of the same etiological 
type (familial defectives). 
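
The borderline status of the second correlation can be illustrated with an approximate confidence interval. The sketch below is not part of the original analyses: it assumes the coefficients are Pearson correlations and applies Fisher's z transformation to the r and N values quoted above (r = .32, N = 46; r = .323, N = 36). The function name `fisher_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation via Fisher's z."""
    z = math.atanh(r)                     # Fisher z transform of r
    se = 1.0 / math.sqrt(n - 3)           # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# r and N as quoted in the text (assumed here to be Pearson correlations)
ci_1940 = fisher_ci(0.32, 46)    # Kreezer (1940)
ci_1950 = fisher_ci(0.323, 36)   # Kreezer and Smith (1950)

print(f"1940: r = .32,  N = 46, 95% CI = ({ci_1940[0]:.3f}, {ci_1940[1]:.3f})")
print(f"1950: r = .323, N = 36, 95% CI = ({ci_1950[0]:.3f}, {ci_1950[1]:.3f})")
```

Under these assumptions the interval for the 1940 sample excludes zero, while the lower bound for the 1950 sample falls very close to zero, which is what a result near the conventional .05 boundary would be expected to look like.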

In regard to the relationship between alpha 
index and MA, Ellingson credits Kreezer with 
“two successes,” but in one paper alone 
Kreezer (1940) reports three findings of sig- 
nificant results in three independent samples: 
a correlation of .348, N = 50; a correlation
of .324, N = 48; and a correlation of .72,
N = 13 (all correlations have a p < .05). 

It must be stressed that these “successes” 
which we have cited in regard to alpha fre- 
quency and alpha index represent an ab- 
solutely minimal count which does not include 
findings which Ellingson apparently does not 
believe are worthy of consideration, or else 
possibly overlooks. However, further argument 
is of no avail since Ellingson does not specify 
how his count of “successes” and “failures” 
was attained. 

Ellingson (1966) writes that in the study 
of Bernhard and Skoglund (1939), the “find- 
ing of [8 out of 16] retardates with alpha 
frequencies below the normal range cannot 
be taken as more than suggestive in view of 
the small N and the lack of control for EEG 
abnormality [p. 94].” The fact is that in the 

1 This article was inadvertently omitted from the 
references of Vogel and Broverman, 1964. 




Bernhard and Skoglund study, 4 of the re- 
tardates evidenced no alpha, so that only 12 
were employed in the analysis. These 12 were 
compared with 130 normal controls, and age 
was controlled. Eight of the 12 retardates had 
alpha frequencies which were lower than that 
evidenced by any of the normals. This evi- 
dence, we submit, is much more than sug- 
gestive. Ellingson’s complaint about the small 
N is surprising, since he accepts without com- 
plaint (see Ellingson, 1966, p. 94) a finding 
of “no relationship” reported by Berkson 
(1963) based on 13 normal controls and 12 
retardates. His complaint about a “lack of 
control for EEG abnormality” is also notable, 
since he accepts approvingly the findings of 
“no relationship” between alpha frequency 
and intelligence which were reported by Berk- 
son, Hermelin, & O’Connor (1961), who in- 
stituted no control for abnormality. In the 
Berkson et al. study, in fact, 3 of the 12 
retards in one group who received routine 
EEG exams evidenced abnormal EEGs. 
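
As a rough check on how unlikely the Bernhard and Skoglund split would be by chance, the counts given above can be placed in a 2 × 2 table (8 of 12 retardates, and by definition 0 of 130 normals, below the lowest normal value) and evaluated with a one-sided Fisher exact test. This is only a sketch: the cutoff is defined from the normal sample itself, so the resulting probability is illustrative rather than a formal reanalysis, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_upper(a, b, c, d):
    """One-sided Fisher exact p: chance of `a` or more cases in the
    top-left cell of a 2 x 2 table with all margins held fixed."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        p += comb(row1, k) * comb(row2, col1 - k) / total
    return p


# Rows: retardates, normals. Columns: below the lowest normal value, not below.
p_value = fisher_exact_upper(8, 4, 0, 130)
print(f"one-sided Fisher exact p = {p_value:.2e}")
```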

Ellingson’s verdict upon Gunnarson’s (1945) 
study is that it does not warrant discussion; 
however, Ellingson does not specify his com- 
plaint. The fact is that Gunnarson examined 
15 mongols, established a control for age, and 
found the alpha frequency below normal for 
all but one patient. His conclusion, given his 
data, would seem to be acceptable. 

We cannot accept Ellingson’s generaliza- 
tions (p. 94) concerning the Novikova (1956) 
article. Novikova studied 62 retards and 54 
normals, 13-16 years of age; and 59 retards
and 46 normals, 9-12 years of age. She found 
that 75% of the normals but only 19% of 
the retards in the older group evidenced 
“pronounced alpha rhythm”; 50% of the 
normals but only 15% of the retards in the 
younger group evidenced “pronounced alpha 
rhythm.” We shall not catalogue the rest of 
her findings, many of which can be demon- 
strated to be statistically significant. Elling- 
son’s comment that Novikova “presented 
no specific alpha frequency data” is mis- 
leading. Novikova’s subjects were classified 
into mutually exclusive categories depend- 
ing upon the “quality” of the alpha rhythm: 
“pronounced alpha rhythm”; “weak alpha 
rhythm”; “alpha rhythm and slow waves”; 
“predominance of slow waves”; etc. It is 
apparent that this continuum necessarily
involved the frequency of the alpha waves
since it is based upon the degree to which
the alpha frequency (8-12 cps) deteriorates
and progressively becomes slower until it disappears
altogether and is replaced by slow
waves (less than 7 cps).
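
The claim that the Novikova percentages can be shown to be statistically significant may be checked with a two-proportion z test. The sketch below uses only the percentages and group sizes quoted above; because the raw counts are not given here, the proportions are entered directly, and the normal approximation is an assumption of this illustration rather than a method used in the original article.

```python
import math
from statistics import NormalDist


def two_proportion_z(p1, n1, p2, n2):
    """z statistic and two-sided p for the difference of two proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# "Pronounced alpha rhythm": normals vs. retardates, as quoted in the text
z_older, p_older = two_proportion_z(0.75, 54, 0.19, 62)      # 13-16 years
z_younger, p_younger = two_proportion_z(0.50, 46, 0.15, 59)  # 9-12 years

print(f"older group:   z = {z_older:.2f}, p = {p_older:.2g}")
print(f"younger group: z = {z_younger:.2f}, p = {p_younger:.2g}")
```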

On page 94, Ellingson (1966) cites errors 
of omission by Vogel and Broverman (1964). 
The omissions consist of a table in the Gibbs, 
Rich, Fois, and Gibbs (1960) article; a 
paper by Berkson et al. (1961), and apparently
unpublished data (Lindsley & Berkson,
cited by Berkson, 1963). In regard to the 
Gibbs et al. (1960) paper, Ellingson writes: 
“[Vogel and Broverman] did not mention that
Gibbs et al. found no difference in alpha distribution
between normal adults and undifferentiated
adult retardates . . . [p. 94].” His
reference is to a bar graph of alpha frequency 
distribution in 1,118 retardates as compared 
with 2,190 normal controls (Gibbs et al., 1960, 
p. 240, Figure 3). We confess to having overlooked
this material. The data are, in fact,
represented by Gibbs et al. (1960) as being 
indicative of “no difference,” although they 
performed no statistics. A chi-square analysis 
done on these data by the present authors in 
fact reveals that the two distributions are significantly
different at better than the .00001
level of confidence (chi-square = 137.95, with
10 df). The observed compared to expected
frequencies of only two cells of the 2 × 11
table (those cells contrasting the relative occurrence
of the slowest alpha frequencies in
the two samples) produce a chi-square of
over 65, indicating adult retardates have
slower alpha frequencies than adult normals.
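
The tail probability implied by the reported statistic can be verified without tables. For an even number of degrees of freedom the chi-square upper-tail probability has a closed form, exp(-x/2) multiplied by the sum of (x/2)^k / k! for k from 0 to df/2 - 1. The sketch below applies this identity to the values quoted above (chi-square = 137.95, 10 df); the cell frequencies themselves are not reproduced here, so only the p value is checked, and the function name is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variate for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for k in range(1, df // 2):
        term *= half / k          # builds (x/2)^k / k! incrementally
        total += term
    return math.exp(-half) * total


p = chi2_sf_even_df(137.95, 10)
print(f"p = {p:.3g}")             # compare against the .00001 level
```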

Ellingson (1966) notes we did not cite
Berkson, Hermelin, and O’Connor (1961) who
“found no differences among controls and
three groups of young-adult retardates with
respect to alpha frequency or alpha index . . .
[p. 94].” We admit to having overlooked this
study. Ellingson does not mention that it
would have been unlikely for these authors to
have found significant relationships between
alpha frequency and any other variable, since
the alpha frequency was measured on the basis
of five 1-second samples, which is an extremely
brief time sample, and is one which
would make the alpha frequency measure
unreliable. Several of Berkson et al.’s (1961)
patients in one sample admittedly had abnormal
EEGs, a fact which, according to Ellingson,
must necessarily invalidate the use of that
sample.

On page 95, Ellingson (1966) cites us as 
noting that the absence of a control for 
chronological age in the Netchine et al. (1959) 
study rather compromises their results. He 
writes, “That this omission was significant is 
suggested by the fact that when alpha index 
was correlated with WISC IQ the correla- 
tions tended to disappear.” The two correla- 
tions to which he refers are between alpha 
index and the WISC Verbal and Performance 
scales. One of the two correlations that “tended
to disappear” is in fact significant at the
p < .05 level of confidence, indicating a positive
relationship between alpha index and
scores on the WISC Performance scale, with 
age controlled. 

In the same paragraph, Ellingson notes that 
he is unable to find the information to support 
our statement that Netchine et al. (1959)
“demonstrated that although the correlations 
between MA and occipital, parietal, rolandic, 
and frontal alpha frequency were all signifi- 
cant, occipital alpha frequency provided lower 
correlations with MA than did measures of 
alpha frequency which were obtained from 
parietal, rolandic, or frontal leads [Vogel &
Broverman, 1964, p. 134].” The table Elling- 
son seeks is Table 4, on page 362. All of the 
correlations referred to are significant at the 
p < .01 level or better. There was an error
which we did not catch that justifiably confused
Ellingson: for the term “alpha frequency”
in the above quote, substitute the
term “alpha index.” Otherwise the statement
is correct.

Next, Ellingson (1966) discusses the 
Netchine and Lairy (1960) article. He clouds 
the issue by including this article in his dis- 
cussion of feebleminded persons, since only 
44 of Netchine and Lairy’s 209 subjects could 
rightly be termed retarded. Ellingson’s comments
are presented in a complicated form, but
essentially he attempts to make two points.

1. “The data were not based on alpha fre- 
quency but on occipital frequency, which is 
not necessarily the same thing [p. 95].” In 
fact, a study of Table 2 (Netchine & Lairy, 
1960, p. 430) indicates that in this case
Netchine and Lairy were using the two terms 
interchangeably; “occipital frequency” seems 
to have been an abbreviated term for “oc- 
cipital alpha frequency.” The means, medians, 
and upper ranges for all 12 groups are in- 
disputably in the alpha range; the low ranges 
given are compatible with slow alpha fre- 
quencies. Had Netchine and Lairy been study- 
ing samples of all occipital frequencies which 
were included in their 5-minute EEG time 
samples, as Ellingson is implying, some beta 
values would be found for a majority of the 
subjects. As is apparent from Table 2, no 
beta values are being reported, not even for 
a single subject. 

2. Ellingson quotes our (1964) statement 
that 
Both positive and negative significant relationships 
between alpha frequency and intelligence have been 
reported by Netchine and Lairy (1960) in a sample 
of 209 children who . . . ranged in intelligence from 
defective through superior. The correlation between 
alpha frequency and intelligence at age 5-6 years 
was —.34, at age 7-8 years, .20; at age 9-10, .30; at 
age 11-12, .38 (all correlations, except that at age 7-8 
years, are statistically significant) [p. 135]. 
Ellingson argues that the correlations were 
based on a total of 67 subjects, with Ns for 
the four age groups of 7, 16, 25, and 19, 
respectively. We are at a loss to understand 
from where Ellingson obtained his incorrect 
information. The correlations are in fact based 
on the full group of 209 subjects: Ns in the
four age groups are 35, 54, 68, and 52.² The
statistical significance of the correlations is as 
represented in the Vogel and Broverman 
article. It is true that “Netchine and Lairy 
did not comment on the statistical signifi- 
cance of the above coefficients,” but since the 
N is known, and the size of the Spearman rho
correlations is known, it presents no problem 
to determine the significance level. 
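
The determination referred to here can be sketched as follows. The code converts each coefficient to a t statistic with N - 2 degrees of freedom, a common large-sample approximation for Spearman correlations (an assumption of this illustration; exact rank-correlation tables could give slightly different values for small N), and obtains a two-sided p by numerically integrating the t density. It is run once with the Ns reported above and once with the Ns attributed to Ellingson. All function and variable names are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(r, n, steps=2000):
    """Two-sided p for a correlation r based on n pairs (t approximation)."""
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps                          # Simpson's rule on [0, |t|]
    area = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area *= h / 3
    return max(0.0, 1 - 2 * area)


rhos = {"5-6": -0.34, "7-8": 0.20, "9-10": 0.30, "11-12": 0.38}
ns_reported = {"5-6": 35, "7-8": 54, "9-10": 68, "11-12": 52}
ns_ellingson = {"5-6": 7, "7-8": 16, "9-10": 25, "11-12": 19}

p_reported = {age: two_sided_p(rhos[age], ns_reported[age]) for age in rhos}
p_ellingson = {age: two_sided_p(rhos[age], ns_ellingson[age]) for age in rhos}

for age in rhos:
    print(f"age {age}: rho = {rhos[age]:+.2f}, "
          f"p (reported Ns) = {p_reported[age]:.3f}, "
          f"p (Ellingson's Ns) = {p_ellingson[age]:.3f}")
```

Under this approximation the pattern of significance depends entirely on which set of Ns is used, which is why the disagreement over sample size matters.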

2 Netchine confirms that the Ns in these analyses
were as represented here by Vogel and Broverman.
Personal communication, September 15, 1965.

In regard to the Netchine and Netchine
(1962) article, Ellingson (1966, p. 95) implies
error on the part of Vogel and Broverman
where in fact there is none. Ellingson
quotes our statement that “Netchine and
Netchine (1962), employing another EEG
index based upon alpha rhythm and an amplitude
measure, again successfully differentiated
high- and low-grade feebleminded subjects.”
In his comments, Ellingson is apparently
suggesting, first, that the two groups
were not composed of high- and low-grade 
retards, and second, that the EEGs did not 
differ. In regard to the first point, it is true, as
Ellingson says, that the two groups were 
“matched for MA and IQ, but differed in 
terms of performance on perceptual motor 
tests”; Netchine’s point was that feeble- 
minded persons of equivalent IQ might still 
differ in terms of particular intellectual skills 
and consequently, differ radically in prog- 
nosis and adaptive behavior. Second, in regard 
to the finding to which we refer in the quote 
above, we were in all respects correct: the 
finding was significant at p < .02 (Netch- 
ine & Netchine, 1962, pp. 247-248). Elling- 
son’s comment that “the two groups did not 
differ with respect to alpha frequency or post- 
anterior voltage gradient” refers to two addi- 
tional analyses, of which the second (assum- 
ing Ellingson and we are referring to the same 
finding, since numerous results were reported 
in this article) was of borderline significance 
(p < .10). There were in fact numerous other 
instances in which the two groups differed 
significantly in terms of EEG, but neither 
here nor in our original article are we inter- 
ested in cataloguing all the significant results 
reported in the literature. 

To summarize, Ellingson’s critique of Vo- 
gel and Broverman (1964) is based almost 
entirely upon error. Further, the three new 
findings cited by Ellingson either support our 
case (Gibbs et al., 1960), evidence methodo- 
logical difficulty (Berkson et al., 1961), or 
are apparently unpublished studies which 
Ellingson cites from secondary sources (Lind- 
sley & Berkson, cited by Berkson, 1963). It 
would seem that Vogel and Broverman’s 
(1964) analyses and conclusions were sub- 
stantially correct, in regard to the relationship 
of EEG and intelligence in retarded persons. 


The Relationship between EEG and Test 
Intelligence in Children 


Ellingson’s (1966, p. 96) conclusion that 
the relationship of EEG to test intelligence in 
children is “contradictory and inconclusive,” 
is due partially to his failure to take into account
relevant differences between the studies.
Mainly, however, his conclusions are due to a
presentation of the data which conceals rather
than reveals existing relationships, and to a
failure to accept evidence that, given objective
reception, would ordinarily be taken as suggesting
a relationship between EEG and intelligence.
Ellingson (1966, p. 95) clouds the
fact that only three studies have been reported
in the literature which have investigated the
relationship of EEG to MA in normal children,
and which have employed the precaution
of controlling for chronological age: all three
have reported significant relationships of EEG
to intelligence (Henry, 1944; Knott, Friedman,
& Bardsley, 1942; Netchine & Lairy,
1960).

The findings of Netchine and Lairy (1960) 
and Henry (1944) of inverse relationships in
normal children between slow activity (delta
or theta) and intelligence stand uncontradicted
in the literature.

Henry (1944) studied children of 8, 9, 10,
and 11 years and reported data for the relationship
between MA and occipital and
central delta at each of the four age levels. He
reported a −.60 correlation at age 9 between
MA and occipital delta (p < .05); a −.41
correlation at age 9 between MA and central
delta (p < .01); a −.64 correlation at age 10
between MA and central delta (p < .01); and
a −.38 correlation between MA and central
delta at age 11 (.10 < p < .20). The other
four correlations did not border on significance.
In the context of reporting this work,
Ellingson (1966, p. 95) quotes Henry to the
effect that “In view of the disconcerting
manner in which such correlations shift about
at different ages, discretion prompts the more
conservative conclusion that there is no demonstrated
relationship.” In fact, this concluding
sentence of Henry’s assumes a different
and less negative coloration when it is
read in the context of the rest of Henry’s
discussion. When Henry concluded “there is
no demonstrated relationship” he was not
denying the significance of the cited correlation
coefficients, but was referring to the lack
of evidence for a simple, direct, significant
relationship which embraced all age levels.
However, in view of the fact that both EEG
and intellectual processes in children show
dramatic changes with small increments in
age, it would be most surprising if the relationship
between any given EEG index and
IQ in children remained invariable at all age
levels.

Ellingson skips lightly over the work of
Netchine and Lairy (1960), who present data
for slow EEG frequencies (occipital theta) on
209 children divided into four age levels (5-6,
7-8, 9-10, 11-12) and three intelligence levels
(IQ < 75, 75-100, > 100). Ellingson’s
statement (p. 95) that Netchine and Lairy 
reported “higher theta . . . in their low IQ 
groups; presumably the differences were not 
significant elsewhere” is unclear if not mis- 
leading, given the text and the table involved 
(Table 4, p. 432). As a matter of fact, at 
every age level, the high IQ group (IQ > 100) 
evidenced less theta than either the middle 
(IQ 75-100) or low (IQ < 75) group; the 
middle IQ group, in turn, evidenced less than 
the low IQ group in each case, with the exception
of the 7-8-year level. Netchine and
Lairy are unclear as to significance levels, but
given the figures it becomes difficult to believe 
that the differences between the high IQ 
group and the other two groups are not highly 
significant. At age 5-6 years, the high IQ 
group evidenced 18.57% theta as compared 
with 28.66% for the middle IQ group and 
31.53% for the low; at age 7-8, the respec- 
tive percentages were 11.87, 28.36, and 23.75; 
at age 9-10, 7.3, 16.42, and 24.61; at age 
11-12, 1.5, 5, and 15. 
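
The ordering described in this paragraph can be checked mechanically from the percentages just quoted. The short sketch below simply tabulates those figures and tests the two ordering claims; it makes no statement about statistical significance, and the variable names are hypothetical.

```python
# Percent theta by age level: (high IQ > 100, middle IQ 75-100, low IQ < 75),
# as quoted in the text from Netchine and Lairy (1960, Table 4).
theta = {
    "5-6":   (18.57, 28.66, 31.53),
    "7-8":   (11.87, 28.36, 23.75),
    "9-10":  (7.3, 16.42, 24.61),
    "11-12": (1.5, 5.0, 15.0),
}

# Claim 1: the high IQ group shows less theta than both other groups at every age.
high_lowest = all(high < mid and high < low for high, mid, low in theta.values())

# Claim 2: the middle group shows less theta than the low group, with exceptions listed.
exceptions = [age for age, (_, mid, low) in theta.items() if not mid < low]

print("high IQ group lowest at every age:", high_lowest)
print("ages where middle group is not below low group:", exceptions)
```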
In regard to the relationship between alpha
frequency and intelligence, both Knott,
Friedman, and Bardsley (1942) and Netchine
and Lairy (1960) found significant relationships
at various age levels. In Netchine and
Lairy’s (1960) study, the relationship between
occipital alpha frequencies and MA was
significant at three of the four age levels
tested; Knott et al. (1942) found a significant
relationship between alpha frequency and
intelligence at one of two age levels tested. Only
Henry (1944) failed to find relationships in
children between alpha frequency and intelligence.
As we suggested previously (Vogel &
Broverman, 1964), this lack of uniformity in
results may be attributable to wide sample
differences, differences in intelligence tests
employed, EEG techniques, etc.






Relationship between EEG and Test Intelli- 
gence in Normal Adults 


Ellingson (1966, p. 96) summarizes the 
data the same as did we: both of us cite 
significant positive relationships between in- 
telligence and alpha frequency reported by 
Mundy-Castle (1958) and Mundy-Castle and 
Nelson (1960); a negative relationship re- 
ported by Sugarman (1961) and no signifi- 
cant relationships reported by Shagass (1946), 
Biesheuvel and Pitt (1955), and Gastaut 
(1960).³ Ellingson concludes “The weight of
available evidence suggests no relationship in 
normal adults.” Vogel and Broverman (1964) 
concluded that the evidence for the relation- 
ship of EEG to intelligence was “weaker for 
samples of normal adults” than for the other 
samples (children, brain-damaged persons, 
etc.) for whom data have been reported. We 
did not realize it at the time, but taken from 
its proper context in our section on “Normal 
Adults” (pp. 136-137) and from the context 
of our “Discussion” (pp. 139-142) this con- 
clusion sounds more ambiguous and less posi- 
tive than we had intended. In fact, our posi- 
tion is that the EEG is related to intelligence 
in normal adults, but that the nature of the 
relationship is complex and has tended to be 
concealed by methodological problems involved
in the measurement of intelligence. Ellingson
(1966) reaches his conclusion only by
overlooking salient differences among the several
studies.

Ellingson (1966) did not mention that, of 
this group of studies, the two articles from 
Mundy-Castle’s laboratory are the most thor- 
oughly executed and the most thoroughly 
reported, and that the results of the second 
study largely replicated those of the first. 
Both of these articles employed the South 
African version of the Wechsler-Bellevue Adult
Intelligence Test; in both studies significant 
positive relationships were obtained at the p 
< .01 level of confidence between the total 
IQ score and alpha frequency. Additional 
significant relationships were obtained in both 
studies between alpha frequency and various 
of the 11 subtests, including Vocabulary. 


3 Gastaut (1960) is an English-language abstract of
the experiment reported in full by Gastaut et al.,
1959.

Mundy-Castle (1958) obtained no significant
relationship between alpha index and the
total IQ score, but it is of the greatest in- 
terest to examine the relationships which 
were obtained between alpha index and the 
various Wechsler subtests: the correlations 
ranged from significant negative relationships 
with both the Picture Completion and Pic- 
ture Arrangement subtests to significant posi- 
tive relationships with the Arithmetic sub- 
test. 

The results of the Mundy-Castle articles, 
particularly the results obtained with alpha 
index, illustrate our point that the relation- 
ship between EEG and adult test intelligence 
will be variable, depending upon the nature of 
the intelligence tests which are employed. In- 
telligence is not a unitary trait but is fac- 
torially extremely complex. The correlations 
among the various Wechsler subtests and 
between the Wechsler-Bellevue and other
measures of normal adult intelligence (Wech- 
sler, 1944) indicate considerable variance 
unique to each separate measure of intelli- 
gence. It should, therefore, come as no sur- 
prise that studies which employ different tests 
will not obtain comparable correlations be- 
tween EEG and intelligence, and that Mundy- 
Castle’s results should differ from those of 
other investigators. 
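
The arithmetic behind this point can be made concrete with a small constructed example. The numbers below are entirely synthetic and are not taken from Mundy-Castle or any other study: one hypothetical subtest rises with the EEG index, another falls with it, and their sum, standing in for a total score, shows no linear relationship with the index at all.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic scores for six hypothetical subjects (illustration only)
eeg_index = [1, 2, 3, 4, 5, 6]
subtest_a = [1.5, 2.0, 4.0, 5.0, 5.0, 6.5]   # rises with the index
subtest_b = [9.5, 8.0, 8.0, 7.0, 5.0, 4.5]   # falls with the index
total = [a + b for a, b in zip(subtest_a, subtest_b)]

r_a = pearson(eeg_index, subtest_a)
r_b = pearson(eeg_index, subtest_b)
r_total = pearson(eeg_index, total)

print(f"index vs subtest A: r = {r_a:+.2f}")
print(f"index vs subtest B: r = {r_b:+.2f}")
print(f"index vs total:     r = {r_total:+.2f}")
```

A composite score can thus conceal strong but opposite relationships at the subtest level, which is the situation the alpha index results above appear to resemble.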

Sugarman (1961), employing the South 
African Group Intelligence Test (SAGIT), obtained
a significant negative correlation between
alpha frequency and the total test
score in a group of 35 university students, but 
not in a group of 15 staff members. Ellingson 
apparently sees this as contradicting the re- 
sults of Mundy-Castle (1958) and Mundy- 
Castle and Nelson (1960) who obtained sig- 
nificant positive relationships. However, as 
nearly as we can determine, the factorial struc- 
ture of the SAGIT has not been established. 
Considering the fact that Mundy-Castle 
(1958) obtained both significant positive and 
significant negative relationships in the same 
sample between alpha index and test intelli- 
gence depending upon which Wechsler subtest 
was involved, it cannot be ruled out that the 
SAGIT may contain a factorial component 
which is negatively related to alpha frequency,
while at the same time the tests employed
by Mundy-Castle contain factorial
components which are positively related to
alpha frequency.

Shagass (1946) found no significant relationship
between alpha frequency and the
Royal Canadian Air Force Classification
Test. However, he cites as a “factor which
limits the conclusions to be drawn from the
above results [that] the extent to which the
Classification test correlates with accepted
individual tests of intelligence is not fully
known,” an important point in view of the
foregoing discussion. Biesheuvel and Pitt
(1955) found no relationship between the
Raven’s Progressive Matrices and alpha frequency;
however, the Raven’s Matrices cannot
be called a comprehensive test of intelligence.

Finally, Gastaut et al. (1959) reported no
relationship between EEG and intelligence
in French army recruits. The only intelligence
tests which Gastaut and his associates attempted
to relate to EEG were French translations
of several of Thurstone’s Primary
Mental Abilities subtests, which were found
not to correlate with EEG. In addition to
questions which naturally arise in regard to
the validity and reliability of intelligence tests
translated from another language, another
problem arises. The relationship of one of
Thurstone’s tests (and very possibly also the
others) with EEG variables was significantly
more curvilinear than linear. However, the
only significance tests reported by Gastaut
are for linear relationships. Consequently,
there is plainly serious question as to the adequacy
of his statistical analysis.
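
Why a test restricted to linear relationships can miss a curvilinear one is easily shown with constructed data. The values below are synthetic and have no connection to Gastaut's measurements: the test score is a perfect inverted-U function of the EEG variable, yet the linear correlation is zero, while the correlation with the squared EEG variable is perfect.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Synthetic inverted-U relationship (illustration only)
eeg = list(range(-5, 6))                 # centered EEG variable
score = [25 - v * v for v in eeg]        # highest in the middle of the range

r_linear = pearson(eeg, score)
r_quadratic = pearson([v * v for v in eeg], score)

print(f"linear r    = {r_linear:+.2f}")
print(f"quadratic r = {r_quadratic:+.2f}")
```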

In summary, we feel that Ellingson’s conclusion
that the “weight of available evidence
suggests no relationship in normal adults” is
unfounded. It is an untenable argument because
it does not take into account the undisputed
fact that intelligence is multifactorial
and that specific measures of intelligence
correlate differentially with any given criterion.
There is nothing in the literature to
contradict the findings reported in the two
Mundy-Castle studies (1958, 1960) or the
Sugarman study (1961), given the fact that
the tests used were different from those employed
in the Gastaut et al. (1959), Biesheuvel
and Pitt (1955), and Shagass (1946) articles
and that the relationships between these
various intelligence tests are unknown.


Studies of the Relationship between EEG and 
Intelligence with Geriatric and Brain-Dam- 
aged Subjects 


Ellingson (1966) dismisses our findings 
with respect to geriatric and brain-damaged 
patients with the words “the evidence .. . 
essentially illustrates the facts, which the 
author has already pointed out, that patients 
with organic brain disorders tend to have ab- 
normal EEGs and tend to display decreased 
intellectual capacity [p. 96].” His dismissal 
of this body of literature appears to us to be 
inappropriate. 

First, one of the two sections to which 
Ellingson refers was entitled “Relationships 
between EEG and Intelligence in Older Per- 
sons.” The fact that a person is old does not 
necessarily mean he evidences either “organic 
brain disorder” or even “decreased intellec- 
tual capacity.” Our definition of “old” or 
“elderly” was operational and was that of the 
studies we cited; any study which involved 
only persons of no less than 60 years we desig- 
nated as a study of “older persons.” We care- 
fully noted in each case whether any given 
study involved normal older persons or ill 
older persons, or both. In addition, normal as 
well as abnormal brain phenomena were ex- 
amined in these studies. Consequently, it is 
unwarranted for Ellingson to dismiss this 
body of literature on the grounds that it 
reveals nothing but a link between brain
damage, EEG abnormality, and poor intellectual
performance.

Second, in regard to our section on persons 
with “organic brain disease,” a reading of this 
section will show that our primary concern 
was not whether persons with abnormal EEGs 
evidenced low intelligence. We focused primarily
on the issue of what kind of EEG abnormality
was related to what kind of intellectual
impairment, which is an important
problem worthy of attention. Such questions
are involved as the localization in the brain of
particular intellectual functions; the relationship
between epileptic EEG patterns and specific
intellectual defects; etc.


DISCUSSION


Ellingson (1966, p. 96) states that the 
relationship between EEG and intelligence is 
absent in normal adults; contradictory and 
inconclusive in retardates and in children; 
and that relationships between abnormal 
EEGs and impaired intelligence in older and 
brain-damaged persons are a function of brain 
disease. We feel the first two conclusions are 
at variance with the data. His third conclu- 
sion is irrelevant to the studies reviewed in 
our article. 

Our own conclusions are those of the previ- 
ous article (Vogel & Broverman, 1964). El- 
lingson has not introduced any new evidence 
nor presented any arguments which would 
require us to modify those conclusions. In 
short, there does seem to be a relationship 
between EEG and intelligence, but the nature 
of the obtained relationship is dependent 
upon a number of factors: the means of intel- 
lectual assessment, sample differences, the 
EEG indices employed, etc. In order that the 
nature of the relationship may become better 
understood, it is necessary that further re- 
search be done which employs methodologi- 
cal improvements such as Vogel and Brover- 
man (1964) suggested: more attention needs 
to be paid to the problem of intellectual as- 
sessment; placement of EEG leads; the as- 
sessment of EEG performance under appropri- 
ate conditions; and the appropriate use of 
EEG indices. For a discussion of each of 
these proposals, the reader is referred to the 
original paper. 

In suggesting that evidence of a relation- 
ship between EEG and intelligence exists, 
Vogel and Broverman (1964) are calling into 
question a consensually dominant assumption 
concerning the nature of the EEG; namely, 
that the EEG is a primitive bioelectric func- 
tion of neural tissue and, as such, cannot be 
expected to relate to higher functions such as 
intelligence or personality. Ellingson (1954, 
1956, 1966) has consistently associated him- 
self with this position which has dominated 
the field for some time. 

It is our opinion that this conviction con- 
cerning the nature of the EEG is an assump- 
tion which has been present a priori from the 
first days of EEG study. The scarcity of evidence
that the EEG is related to more complex
forms of behavior has been due less to a
failure to find relationships which have been 
sought than it has been due to a failure to 
initiate research. Mundy-Castle (1958) came
to a similar conclusion: “. . . it would ap- 
pear that the general assumption that there is 
no such relationship [between EEG and in- 
telligence, among normal adults] rests on in- 
adequate experimentation [p. 194].” 


3 myths prevalent in psychotherapy research are considered and refuted. These
include the uniformity assumptions, spontaneous remission of psychoneurosis,
and the belief that present theoretical formulations provide adequate paradigms.
Several other confusions are listed and clarified including the process-outcome
distinction, the classification problem, and the expectation of the definitive
study in therapy research. Finally, an attempt is made to delineate the minimal
requirements of any psychotherapy research paradigm by incorporating present
empirical evidence as well as by specifying common sources of confounding
vis-a-vis the therapy interaction.


One of the unfortunate effects of the prolific 
and disorganized psychotherapy research 
literature is that a clear-cut, methodologically 
sophisticated, and sufficiently general para- 
digm which could guide investigations in the 
area has not emerged. Perhaps this is an un- 
avoidable state of affairs in a new area of 
research. Yet a perusal of this literature indicates
that most of the basic considerations
necessary for a general paradigm have ap- 
peared, albeit in many cases parenthetically, 
at some place or another. But to date no one 
has attempted to integrate empirical findings 
and methodological concerns in a way that 
might lead to a useful research paradigm. This 
lack of integration of the paradigm ingredi- 
ents has minimized their impact on investi- 
gators in the area. Moreover, concomitant 
with, and perhaps because of, the absence of 
a paradigm several myths have been perpetu- 
ated; and because of these myths, inadequate 
designs continue to appear. 

This paper will first attempt to spell out 
in some detail several myths current in the 
area of psychotherapy research which con- 
tinue to weaken research designs and confuse 
the interpretation of research findings. Sec- 
ondly, it will attempt to present a minimal 
but general paradigm which takes into ac- 
count current theoretical inadequacies and 
empirical learnings, and focuses on methodo- 
logical considerations which can no longer 
continue to be ignored. 


SOME MYTHS OF PSYCHOTHERAPY RESEARCH


We should be wary of pseudo-quantifications and 
methodological gimmicks which often tend to close
off prematurely an area of inquiry and give rise to
the illusion that a problem has been solved, whereas 
the exploration has barely begun. [Strupp, 196}, 
p. 470]. 


This first section is devoted to enumerating 
and refuting several misconceptions about 
psychotherapy research which have tena- 
ciously persisted. These myths have served the 
unfortunate purposes of confusing the conceptualization
of psychotherapy, hence impeding
research progress; and of spreading
pessimism regarding the utility of further
research. Let us deal with these myths in turn.


The Uniformity Assumption Myths 


This misconception was first labeled by 
Colby (1964) although it has been alluded 
to by several other authors (Gendlin, 1966; 
Gilbert, 1952; Kiesler, 1966; Rotter, 1960; 
Strupp, 1962; Winder, 1957). Colby only 
parenthetically mentioned the myth, and al- 
luded to only one of its aspects, that regard- 
ing patients in psychotherapy research. This 
paper will extend its meaning to cover the 
psychotherapy treatment itself. The former
is referred to as the Patient Uniformity Assumption,
the latter the Therapist Uniformity
Assumption.

Patient Uniformity Assumption. For Colby,
the Uniformity Assumption refers to the
belief that “patients at the start of treatment
are more alike than they are different.” This
implicit assumption has led to a remarkably
naive manner of choosing patients for psychotherapy
research: patients are not selected,
rather they pick themselves by a process of
natural selection. In searching for patients for
investigating psychotherapy, researchers have 
traditionally chosen available samples, such 
as any patient coming to a clinic over a 
certain period of time. In some cases patients 
are further divided into experimentals and 
controls. But, in most studies, all patients
receive therapy, measures being taken pre 
and post in order to reflect its efficacy. In 
any case, the assumption has been made that 
by this procedure one obtains a relatively 
homogeneous group of patients differing little 
in terms of meaningful variables, and homo- 
geneous simply because they all sought out 
psychotherapy (be it in a counseling service, 
outpatient clinic, mental health clinic, private 
practice, or what have you). 

Far from being relatively homogeneous, pa- 
tients coming to psychotherapy are almost 
surely quite heterogeneous—are actually 
much more different than they are alike. The 
assumption of homogeneity is unwarranted 
since on just about any measure one could 
devise (demographic, ability, personality, 
etc.) these patients would show a remarkable 
range of differences. This is apparent from 
clinical experience and from much of the evidence
on initial patient characteristics that
is available today. Because of these initial
patient differences, no matter what the effect
of psychotherapy in a particular study (be
it phenomenally successful or a dismal
failure), one can conclude very little if anything.
At best, one can say something such
as: for a sample of patients coming to this
particular clinic over this particular period,
psychotherapy performed by the clinic staff
during that period on the average was either
successful or unsuccessful. No meaningful
conclusions regarding the types of patients
for whom therapy was effective or ineffective
are possible. This is inevitably the case since
no patient variables crucially relevant for
subsequent reactivity to psychotherapy have
been isolated and controlled. 

This Patient Uniformity Assumption hampered
research in the area of schizophrenia
for years. The assumption was that patients
diagnosed as schizophrenic are more alike
than different. Subsequent data showed very
clearly that some schizophrenics were quite
different from others, in fact more like normals
than they were like other schizophrenics
(Herron, 1962). In other words, extreme
variability was the usual case when one 
lumped all patients diagnosed schizophrenic. 
It was only when this variability could be 
reliably reduced that useful research could 
begin to be done. The important empirical 
finding was that schizophrenic patients dif- 
fered markedly with respect to the abruptness 
of onset of their disorder. Some could be 
reliably classified in terms of their case histories
as “process” schizophrenics, those with
a relatively long-term and gradual onset; 
while others had a quite short-term and 
abrupt onset, “reactive” schizophrenics. It 
was further discovered that the reactive 
schizophrenics had much better prognosis. 
But most importantly, the radical reduction 
in variability permitted by this operational 
distinction made possible for the first time 
research that could lead to replicable differ- 
ences among process schizophrenics and other 
diagnostic groups. 

Now, few would argue that for some pa- 
tients psychotherapy is not effective. Clinical 
experience as well as research data point 
clearly to this conclusion. Many studies have 
shown patient differences evident at the be- 
ginning of psychotherapy which are crucially 
related to subsequent dropout or failure in 
therapy.1 If psychotherapy is differentially 
effective depending on initial patient differ- 
ences, as the evidence strongly suggests, then 
it seems clear that research should take these 


1 In the following discussion extensive bibliographical
citations will be omitted since the resulting list
would be prohibitive. Fortunately, an exhaustive 
bibliography of the psychotherapy research literature 
has now appeared and is available (Strupp, 1964). 
From the topical index provided one can find an 
exhaustive bibliography of studies for each of the 
therapy variables discussed below. Also, several spe- 
cific reviews provide some integration of the many 
studies and can be usefully consulted (Auld & Mur- 
ray, 1955; Breger & McGaugh, 1965; Cartwright, 
1957; Eysenck, 1961; Gardner, 1964; Goldstein, 
1962; Grossberg, 1964; Herzog, 1959; Marsden, 
1965; Stieper & Wiener, 1965; Zax & Klein, 1960). 
Finally, the chapters on “Psychotherapeutic Proc- 
esses” and “Counseling” in the Annual Review of 
Psychology (1950-1965) as well as the American 
Psychological Association’s two volumes on Research 
in Psychotherapy (Rubinstein & Parloff, 1959; 
Strupp & Luborsky, 1962) are both indispensable
sources for critical reviews of research.

differences into account. This would imply
the use of a design with at least two experi- 
mental groups, dichotomizing patients by one 
or more patient variables shown to relate to 
subsequent outcome; or matching experi- 
mentals and controls on these relevant initial 
patient measures. Meaningful results will not 
occur if one continues to aggregate patients 
ignoring the meaningful variance of relevant 
patient characteristics.

Therapist Uniformity Assumption. Perhaps 
an even more devastating practice in psycho- 
therapy research has been the selecting of 
various therapists for a research design on 
the assumptions that these therapists are 
more alike than different and that whatever 
they do with their patients may be called 
“psychotherapy.” Theoretical formulations 
seem to have perpetuated this assumption 
since they have traditionally focused on de- 
scribing “The Therapy” which is ideally ap- 
propriate for all kinds of patients. The myth 
has been perpetuated that psychotherapy in 
a research design represents a homogeneous 
treatment condition; that it is only necessary 
for a patient to receive psychotherapy, and 
that mixing psychotherapists (whether of the 
same or different orientations) makes no 
difference. 

This kind of loose thinking seems to have 
grown out of what Raimy (1952) has called 
the “egg-shell era” of psychotherapy wherein 
it was rare for a therapist to tape record his 
sessions or in other ways to make public his 
therapeutic behavior. 


Almost without exception . . . psychotherapists
adopted the attitude that patients and clients were 
frail, puny beings who would flee the field if anyone 
except their own private psychotherapist touched 
them. In view of the lack of evidence to the con- 
trary, such an attitude was at one time entirely 
justified since, in simple ethics, the welfare of the 
patient does come first. The fact that the attitude 
also had other supports, particularly the fact that 
it provided defensive measures for the therapist’s 
unsteady ego-structure, made no difference [Raimy, 
1952, p. 324]. 


In this aura of mystery it became easy for 
one to think of “the One” psychotherapy 
which would maximally benefit all patients. 
As Colby (1964) relates: “As long as what 
went on in therapeutic sessions remained 
secret, myths of consensus within paradigms
were easily perpetuated. For example, among
psychoanalysts there grew the myth of a
single agreed-upon perfect technique [p.
348].” The advent of general tape-recording
has made records of the therapist’s behavior
progressively more available to other clinicians
and researchers, and it has become
apparent that differences in technique and 
personality exist, even within schools, and 
that disagreement prevails. 

Despite this token admission of therapist 
differences, the Uniformity Assumption still
abounds in much psychotherapy research.
Patients are still assigned to “psychotherapy” 
as if it were a uniform, homogeneous treat- 
ment, and to psychotherapy with different 
therapists as if therapist differences were 
irrelevant. As Rotter (1960) observes: 


Although the trend is generally away from the notion
of psychotherapy as an entity, there is still too much
concern with the process of therapy. For many there
is an assumption that there is some special process
which takes place in patients, accounting for their
improvement or cure. . . . In a similar way the role
of the therapist is conceived of as one ideal set of
behaviors which will maximally facilitate the mysterious
process. The alternative conception that psychotherapy
is basically a social interaction which
follows the same laws and principles as other social
interactions, and in which many different effects can
be obtained by a variety of different conditions, is
frequently neglected [p. 408].


In addition, the Myth ignores the growing
body of evidence that psychotherapists are
quite heterogeneous along many dimensions
(e.g., experience, attitudes, personality variables)
and that these differences seem to influence
patient outcome. Some therapy may
be more effective than others; and it seems
not too unlikely that some therapy may be
more deleterious than no therapy at all
(Rogers, Gendlin, Kiesler, & Truax, 1966).
As Meehl (1955) states: “In our present
ignorance it is practically certain that clients
are treated by methods of varying appropriateness,
largely as a function of which therapist
they happen to get to. Also, it is practically
certain that many hours of skilled
therapists are being spent with unmodifiable
cases [p. 374].”

If psychotherapy research is to advance,
it must first begin to identify and measure
these therapist variables so relevant to
eventual outcome (personality characteristics,
technique factors, relationship variables, role
expectancies, and the like). One can then
build therapist differences into his design by 
having several experimental groups, with 
therapists at respectively different levels of 
relevant dimensions. Or, one can match thera- 
pists on relevant factors. To continue the 
practice of assigning some patients to psycho- 
therapy and others to a “control” group 
seems futile. 

Unfortunately, this kind of naive conceptu- 
alization has dominated studies evaluating the 
effects of psychotherapy where therapy and
control groups are compared. Typically, the 
results have been discouraging in that no 
significantly greater improvement has been 
demonstrated for the “therapy” patients 
(Eysenck, 1961). As Gendlin (1966) argues: 


Research in psychotherapy has suffered from the fact 
that psychotherapy was not definable. It has meant 
that, if an experimental therapy group was com- 
pared to a non-therapy control group, some of the 
supposed therapy subjects were not really receiving 
something therapeutic at all. . . . The effect of aver- 
aging the changes in the “experimental” group as 
compared with the “control” group often showed no 
significant differences. To bring this home, imagine 
trying to investigate the effects of a drug with an 
experimental group taking the drug and a control 
group receiving a placebo. Imagine that some (per- 
haps half) of your “experimental” group are actually 
taking a preparation without the effective ingredients 
of the drug... and you don’t know which ones 
these are. Then, too, perhaps one or two “controls” 
are actually getting the drug on the side. Your 
“experimental treatment” group is not always getting 
the treatment. 


An implication of this consideration is that 
if one could analyze the variability (rather 
than or in addition to the mean differences) 
of the experimental and control groups in 
these many studies, the therapy groups would 
be expected to show a much greater range of 
improvement behavior than the controls. This 
would follow if some of the “experimental” 
patients indeed were exposed to some quite
“bad or indifferent psychotherapy.” 
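
The variability argument above can be illustrated with a minimal sketch. The change scores below are invented purely for illustration and come from none of the studies cited; the function name `spread_report` is likewise hypothetical. The sketch shows how a "therapy" group that mixes helped and harmed patients can have the same mean change as a control group while showing a much larger variance.

```python
from statistics import mean, pvariance

def spread_report(therapy_changes, control_changes):
    """Return (difference in mean change, ratio of population variances)."""
    mean_diff = mean(therapy_changes) - mean(control_changes)
    var_ratio = pvariance(therapy_changes) / pvariance(control_changes)
    return mean_diff, var_ratio

# Hypothetical change scores (post minus pre); invented for illustration only.
control = [1, 2, 3, 2, 1, 3, 2, 2]        # modest, fairly uniform change
therapy = [6, 5, -2, 7, -3, 6, -2, -1]    # some improve a lot, some get worse

mean_diff, var_ratio = spread_report(therapy, control)
print("difference in means:", mean_diff)          # the two group means coincide
print("variance ratio (therapy/control):", var_ratio)
```

With these invented numbers a comparison of means alone would show nothing, whereas a comparison of spread would flag the heterogeneity of the "therapy" condition. In real data a formal test of variances would be needed; the sketch only shows the logic.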

Sanford (1953) states this still differently:
“From the point of view of science, the question
‘Does psychotherapy do any good?’ has
little interest because it is virtually meaningless.
. . . The question is which people, in
what circumstances, responding to what
psychotherapeutic stimuli.” And Gilbert
(1952) urges: 


One of the extremely important problems in psycho- 
logical counseling and therapy is that concerning 
the need for being able to describe and quantify 
different types of psychotherapeutic relationships. 
The possibility of accomplishing this is basic to 
further studies concerning the relative effectiveness 
of different types of counseling procedures. Studies 
of this nature will be necessary before it will ever 
be possible to make any scientific statements regard- 
ing such basic questions as the relative effectiveness 
of different counseling procedures with different types 
of counselee problems, or the relative effectiveness of 
a single over-all approach to all types of counseling 
problems as compared with diverse approaches based 
upon various diagnostic categories [p. 360]. 


Finally, Kiesler (1966) states in summarizing 
the outcome results of a 5-year study of indi- 
vidual psychotherapy with schizophrenics 
(Rogers et al., 1966): 


The picture emerging is that if we compare the 
therapy patients as a group to the control patients 
no differences emerge. Yet—and this seems to me to 
be the crucial question for outcome studies—if we 
divide the therapy cases by means of a theoretically 
relevant therapist variable (in this case level of 
“conditions” offered) the results are quite consistent 
with theoretical expectation. Hence, my final point 
would be that before we can validly assess the out- 
come or therapy evaluation problem, it is vitally nec- 
essary that we attempt to isolate therapist dimen- 
sions that will accurately reflect the heterogeneity of 
therapist performance. If we continue to evaluate 
therapy as a homogeneous phenomenon we will 
continue to obtain invalid results. 
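
The pattern described in this passage, no difference when therapy cases are pooled but clear differences once they are divided by a therapist variable, can be sketched as follows. All scores, the "high"/"low" labels, and the helper name `group_means` are assumptions made up for illustration; they are not data from the study quoted.

```python
from statistics import mean

# Hypothetical records: (group, therapist "conditions" level, change score).
# Every number here is invented for illustration only.
records = [
    ("control", None, 2), ("control", None, 3),
    ("control", None, 1), ("control", None, 2),
    ("therapy", "high", 5), ("therapy", "high", 4),
    ("therapy", "low", 0), ("therapy", "low", -1),
]

def group_means(rows, key):
    """Average change score for each distinct value of key(row)."""
    buckets = {}
    for row in rows:
        buckets.setdefault(key(row), []).append(row[2])
    return {k: mean(v) for k, v in buckets.items()}

# Pooled comparison: therapy treated as one homogeneous condition.
pooled = group_means(records, key=lambda r: r[0])
# Split comparison: therapy cases divided by the therapist variable.
split = group_means(records, key=lambda r: (r[0], r[1]))

print(pooled)
print(split)
```

In this toy example the pooled therapy and control means are identical, while the split shows one therapist level above the controls and the other below them, which is the kind of result a uniform "therapy versus control" design cannot reveal.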


In summary, it would seem quite essential 
and useful to bury these Uniformity Myths 
once and for all. Until our designs can incor- 
porate relevant patient variables and crucial 
therapist dimensions—so that one can assess 
which therapist behaviors are more effective 
with which type of patients—we will continue 
to perpetuate confusion. Psychotherapy re- 
search must come to grips with the need for 
factorial designs—as recommended a decade 
ago by Edwards and Cronbach (1952)— 
wherein different types of patients are as- 
signed to different types of therapists and/or 
therapy, so that one can begin to discover 
the parameters needed to fill in a meaningful
paradigm for psychotherapy.

As a postscript, it is often not remembered
that different theoretical formulations and
techniques have at least originally been derived
from patients of different types. As Stein
(1961) observes: 


One source of difference between schools of psycho- 
therapy that is often overlooked and which needs to 
be made explicit is the difference in types of patients 
on which the founders of the different schools based 
their initial observations. Maskin summarizes this 
point rather well when he says: “Freud used hysteria 
as the model for his therapeutic method, depression as 
the basis for his later theoretical conjectures. Adler’s 
clinical demonstrations are rivalrous, immature char- 
acter types. Jung’s examples were constructed to a 
weary, worldly, successful, middle-aged group. Rank 
focussed upon the conflicted, frustrated, rebellious 
artist aspirant. Fromm’s model is the man in a white 
collar searching for his individuality. And Sullivan's 
example of choice is the young catatonic schizo- 
phrenic.” To this one might add that Rogers’ original 
formulations were based on college students [pp. 
6-7]. 


Assuming these theoreticians were all perspi- 
cacious and accurate in their observations, the 
historical fact that their different formulations 
were derived from experience with different 
classes of patients would seem to reinforce 
strongly the necessity and appropriateness for 
abandoning the Uniformity Assumption in 
psychotherapy research. 


The Spontaneous Remission Myth 


This second myth has been perpetuated 
primarily by Eysenck (1952, 1954, 1955a, 
1955b, 1961, 1964). Despite many refutations, 
it has continued to muddle research regarding 
the effectiveness of psychotherapy, and has 
fostered much of the pessimism that has more 
recently colored this research. Although this 
conception was restricted by Eysenck to psy- 
choneurosis alone, its implications seem to 
have generalized to most of psychotherapy. 
Its more specific statement takes the follow- 
ing form (Eysenck, 1961): “We may con- 
clude with some confidence that about two- 
thirds of severe psychoneurotics show recovery 
or considerable improvement without the bene- 
fit of systematic psychotherapy, after a lapse 
of two years from the time that their disorder 
is notified, or they are hospitalized [p. 711].” 
The clear implication of this proposition is 
that for psychotherapy to be proven worth- 
while, it has to demonstrate it can beat this 
two-thirds percentage, since two-thirds of the
patients improve without having anything
done to or for them. 

This percentage represents a rather severe 
standard, as the evidence, such as it is, has 
reflected. Without this base rate for compari- 
son most therapists and laymen might be 
satisfied with a two out of three success rate. 
But with this base rate of spontaneous remis- 
sion psychotherapy needs to be almost totally 
successful. Apparently the assumption has 
taken a rather tight hold on both practitioners 
and researchers of psychotherapy. Yet, the 
surprising fact is that the entire evidence for 
this assumption comes from the findings of 
two studies, which are, at best, ambiguous. 
Further, the assumption contradicts clinical 
experience as well as some of the experimental 
findings regarding human and animal learning, 
and has been refuted in the literature on sev- 
eral occasions. Unfortunately, these refuta- 
tions focused on different aspects of the argu- 
ment, and were obscured by their connection 
with the effectiveness-of-therapy polemic. 
Hence, it is necessary to separate the spon- 
taneous remission argument from the latter 
polemic, and to integrate the various argu- 
ments against spontaneous remission. This sec- 
tion will, therefore, seriously reconsider this 
assumption with the hope that this refutation 
will bury it permanently. 

In the first place, clinical lore indicates that 
the phenomenon of spontaneous remission has 
been observed for only three diagnostic categories.
The first category is acutely reactive
schizophrenics, who typically experience an
abrupt onset of psychosis under usually specifiable
traumatic conditions, and whose premorbid
history is relatively free of gross pathology.
Lasting recovery is generally rapid
for these schizophrenics regardless of treat- 
ment. The other two diagnostic groups in- 
clude the reactive and psychotic depressions. 
After temporary remission of their depressive 
symptoms these patients characteristically exhibit
a regular course of recovery, ordinarily
for a period of about 2 years, after which the
depression recurs. It would obviously be essential
in any studies evaluating therapy with
any of these three groups that these remission
characteristics be considered. But, as far
as can be ascertained by this author, spontaneous
remission as a typical phenomenon
has not been clinically observed for other
types of patients. In regard to psychoneurosis,
moreover, clinical tradition indicates quite
clearly that, rather than spontaneous recovery,
increased rigidity of symptoms tends to be the
rule when the patient remains untreated.
Freud was so impressed by the rigidity of the
resistance encountered in the treatment of
psychoneurosis that he coined the term repetition
compulsion to describe the process. Secondly,
no attempt has been made to explain
the phenomenon in other than quite gross
terms. If spontaneous remission of neurosis
occurs, it must occur via some psychological 
and/or physiological process. What is the na- 
ture of this process? What is the stimulus 
which initiates the process of recovery? Are 
the stimulus and the process the same for all 
psychoneurotics, or different for various types? 
How does it come about that attitudes and 
habit systems on which one has acted for 
much of his life are modified so easily without 
rather energetic intervention of some sort? 
What makes an habitual maladaptive pattern 
of behavior suddenly begin to disappear? 
These are crucial questions that need to be 
considered regarding spontaneous remission. 

Thirdly, how can one reconcile spontaneous
remission with the evidence in the area of
learning regarding habit strength and particularly
the extreme difficulty of extinguishing
avoidance responses?

Since this phenomenon seems counter to
clinical experience, is only grossly explained,
and contradicts evidence from learning research,
it would seem that the empirical evidence
for its existence needs to be quite impressive
indeed before its generality can be
accepted. Instead, the entire argument for
spontaneous remission of neurotic patients
comes from two survey studies cited by
Eysenck (Landis, 1937; Denker, 1947), whose
results are interpreted to meet the needs of
an ineffectiveness-of-therapy polemic. Let us
examine these two studies critically to see if
Eysenck’s conclusion is justified.

In approaching the problem of evaluating
psychotherapy, in 1952 Eysenck searched in
vain for a psychotherapy research study which
had included a control group in its design.
This was a legitimate search, since there is
always the possibility in research that some
variable other than the defined treatment
variable is responsible for the effects observed 
for the experimental subjects. To remove this 
possibility of confounding, one traditionally 
uses a group of control subjects. In the present 
case, if therapy patients change significantly 
more than controls, one can legitimately con- 
clude that some aspect of the treatment, 
ceteris paribus, is responsible for the differ- 
ences. 

But, as mentioned, Eysenck found it im- 
possible to find any such study in his 1952 
survey. Hence, as a substitute for the missing 
experimental control groups he looked for 
evaluative studies of untreated psychoneurotics 
(receiving no psychotherapy) where the pa- 
tients had been followed up over time to de- 
termine what, if any, improvement occurred 
“spontaneously” as the result of the “natural 
healing process.” Eysenck found two pub- 
lished studies which seemed to satisfy these 
criteria. He then abstracted and used the per- 
centage of cases who improved over time 
from these two untreated samples as a base 
line with which to compare the changes ob- 
served in the reported studies of psycho- 
therapy in the literature at that time. 

The first of these base-line studies was that 
of Landis (1937) who reported the ameliora- 
tion rate in state mental hospitals (in New 
York State as well as in the United States 
generally) for patients diagnosed under the 
heading of psychoneurosis. Because of the 
overcrowding of state hospitals and their 
chronic understaffing problem, it would seem 
extremely unlikely that these hospitalized 
neurotics received much, if any, therapy. 
Hence, any recovery observed for them could 
legitimately be considered as spontaneous. 
Landis reported that the percentage of pa- 
tients “discharged annually as recovered or 
improved” was 70% for New York State 
(during the years 1925-1934) and 68% for 
the United States as a whole (1926-1933). 
Eysenck (1961) concludes from these data: 
“By and large, we may thus say that of 
severe neurotics receiving in the main cus- 
todial care, and very little if any psycho- 
therapy, over two-thirds recovered or im- 
proved to a considerable extent.” Quoting 
Landis, he continues: “Although this is not, 
strictly speaking, a basic figure for ‘spontaneous’
recovery, still any therapeutic method
must show an appreciably greater size than 
this to be seriously considered.” In other 
words, Eysenck seems to be saying that al- 
though this is not a basic figure for “spon- 
taneous” remission, we can still treat it as 
such. 

The second base-line estimate which Ey- 
senck offers comes from a study by Denker 
(1947). Denker’s report concerns 500 dis- 
ability claims taken from the files of the 
Equitable Life Assurance Society of the 
United States. These claims were made by 
persons who reportedly had been ill of a 
neurosis for at least 3 months before their 
claims were submitted. The claimants came 
from all parts of the country, had many dif- 
ferent occupations, and included all types of 
psychoneuroses. During their disability (de- 
fined as inability to carry on with any “oc- 
cupation for remuneration or profit”) these 
patients were regularly treated only by their 
local general practitioners “with sedatives, 
tonics, suggestion, and reassurance, but in no 
case was any attempt made at anything but 
this most superficial type of ‘psychotherapy’ 
which has always been the stock-in-trade of 
the general practitioner.” The disability bene- 
fits the patients received ranged from $10 to 
$250 monthly. Denker followed up these cases 
for at least a 5-year period after their illness, 
and often for as long as 10 years after the 
period of disability had begun. The criteria 
he used for “apparently cured” were, (a) 
complaint of no further, or very slight, difficulties,
and (b) successful social and economic
adjustment by the patient. 

Eysenck (1961) reports: 


Using these criteria, which are very similar to those 
usually used by psychiatrists, Denker found that 
45% of the patients recovered after one year, an- 
other 27% after two years, making 72% in all. 
Another 10%, 5%, and 4% recovered during the 
third, fourth, and fifth years respectively, making a 
total of 90% recoveries after five years [pp. 710- 
711]. 
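
The cumulative percentages in this quotation can be checked with a short sketch. The yearly figures are taken directly from the quoted passage; the variable names are hypothetical. Note that the yearly figures as quoted add up to 91, one point above the stated five-year total of 90; whether this reflects rounding in the source figures is an assumption, not something the quotation settles.

```python
from itertools import accumulate

# Percent of Denker's cases recovering in each successive year, as quoted.
yearly_recovery = [45, 27, 10, 5, 4]

# Running total of recoveries by the end of each year.
cumulative = list(accumulate(yearly_recovery))

for year, total in enumerate(cumulative, start=1):
    print(f"by end of year {year}: {total}% recovered")

two_year_total = cumulative[1]   # the figure used for the two-year base line
five_year_total = cumulative[4]  # compare with the quoted total of 90%
```

The two-year total of 72% is the figure on which the "two-thirds" base rate rests.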


These are certainly very striking figures. 
Eysenck finally concludes: 


If we take a period of about two years for each base- 
line estimate, which appears to be a reasonable figure 
in view of the fact that psychotherapy does not 
usually last very much longer than two years and 
may sometimes last less, we may conclude with some
confidence that about two-thirds of severe neurotics
show recovery or considerable improvement, without
the benefit of systematic psychotherapy, after a
lapse of two years from the time that their disorder
is notified, or they are hospitalized [p. 711]. 


Is this conclusion justified from Landis’ and 
Denker’s findings? Is Eysenck correct when he
states that two-thirds of untreated psychoneurotics
will, over a 2-year period, experience
spontaneous remission of their neurotic illnesses?
Many individuals have questioned this
conclusion, notably Rosenzweig (1954), as
well as others (Cartwright, 1955; de Charms,
Levy, & Wertheimer, 1954; Dührssen & Jorswieck,
1962; Luborsky, 1954; Stevenson,
1959; Strupp, 1964a, 1964b). Let us examine
in detail these counterarguments.

Rosenzweig provides the most comprehen- 
sive and critical attack on the conclusion 
Eysenck draws from the Landis and Denker 
studies. His basic argument is that before 
these two studies can be considered as representing
a base line for recovery for untreated
psychoneurotics (thereby functioning as extrapolated
control groups for studies evaluating
the effects of psychotherapy) the data of
these studies must show three experimental
characteristics: (a) the patients used in the
Landis and Denker studies must be comparable
to those treated by psychotherapy—
that is, the definition of psychoneurosis for
the patients in these studies must be the same
as that for patients in psychotherapy, and the
severity of the neurotic illness must be
equivalent for the contrasted groups; (b) the
Landis and Denker base-line groups must in
fact have received no psychotherapy; otherwise
the essential meaning of control group
here is violated; and (c) the criteria for successful
outcome or improvement need to be
equivalent, so that recovery or improvement 
means the same thing for the Landis and 
Denker patients as for typical psychotherapy 
patients. 

Rosenzweig then proceeds to argue that the 
Landis and Denker studies violate all three of 
these necessary conditions; therefore, Ey- 
senck’s conclusion of two-thirds spontaneous 
recovery is unwarranted. If Rosenzweig is 
correct, then the purported phenomenon of 
spontaneous recovery for psychoneurotic patients
is indeed a myth, since the Landis and
Denker studies are the only evidence offered
for its existence. Let us look at Rosenzweig’s
and others’ arguments in detail as to why the
two studies do not meet the three essential
conditions for a psychotherapy control group.

1. Are the Patient Groups Comparable? In
the first place, the Patient Uniformity Myth
is operative in this comparison. It is quite
easy, but incorrect, to assume that patients
labeled psychoneurotic are more alike than
different, despite the fact that they are natu-
rally selected in both the Landis and Denker
studies. From an a priori basis alone the
probability seems quite small that equivalent
groups resulted from these several natural
selection processes. Rosenzweig further argues
that


The insurance disability cases were, as a whole, in
all likelihood less severely ill than any of the others.
Denker himself points out that in these cases where
disability income was a factor the illness may have
been prolonged by this tangible secondary gain
[money]. By the same token the illness may very
well have been initiated, or at least partly instigated,
by conscious or unconscious prospects of such gains.
To compare psychoneuroses of long standing, dating
in many instances from early childhood (the typical
case treated by psychoanalysis), with such disability
neuroses is highly dubious, and the fact that the
latter would have cleared up quickly after brief
treatment by a general practitioner is thus not sur-
prising [p. 300].


Cartwright (1955) further argues against the
psychoneurotic status of Denker’s insurance
patients:


Denker’s study was published in 1946, and all cases
were followed up for at least five years after re-
covery. If it is assumed that Denker’s research took
one year to carry out, then, since some cases were
[...] for five years and others for only one, all
[...] of neurosis had their onset between 1934
[...] 1933 the economic depression was at
[...] in the United States. From that time on,
the economy tended to improve except for
a partial relapse around 1937-38. . . . It is evident
that this period (which overlapped the period of
[...] of Denker’s subjects) was one of general
[...] a condition of severe unemployment to
a condition of plentiful employment throughout the
[...]. These data (i.e., employment rates
[...] to 1944) suggest that it is reasonable to
[...] proportion of the variance of Denker’s
[...] be accounted for in terms of national
recovery from economic depression rather than a
recovery from neurosis [p. 292].
And Luborsky (1954) speculates still further
about the lack of comparability of the Denker 
patients to psychotherapy patients: 

Many of the “insurance” group would probably 
never have visited the doctor if it were not required. 
As a whole the group is probably of higher social 
and economic level than other groups (apparently 
since they were able to carry disability insurance in 
the first place). Very likely the choice of a general 
practitioner rather than a psychiatrist to treat their 
psychoneurosis reflects a not-to-be-ignored difference 
in an attitude to their illness [p. 129], 


or as Cartwright has just argued, reflects the 
scarcity and relative expense of psychiatrists 
in depression years. 

Regarding the lack of comparability of the 
patients in the Landis study, Rosenzweig 
makes the following comments: 


Here one could reasonably expect that the neuroses 
must have been extraordinarily severe in order for 
these patients to have become eligible for admis- 
sion to these crowded institutions. In these in- 
stances the outcome of treatment would be ex- 
pected to be far less favorable than for either the 
Denker control group or the experimental groups 
[p. 300]. 


This of course argues for less spontaneous 
recovery for Landis’ patients, which is incon- 
sistent with the percentages reported, at least 
for the questionable criterion of recovery that 
Landis used. 

Regarding both the Denker and Landis pa- 
tient groups, Rosenzweig summarizes: 


It may be concluded that, in general, the Denker 
base-line group was probably less seriously ill, the 
Landis control group more seriously ill [than the
patients who typically are seen in psychotherapy]. 
To the degree that this conclusion is sound it may 
be further inferred that the control and experimental 
groups fail to meet an essential criterion of com- 
parability—illness severity [p. 300]. 


It seems quite clear from the above rebut- 
tals how one can get into inextricable in-
terpretive difficulties by operating on a mis-
conception as unfounded as the Patient Uni-
formity Assumption (for those cases where
patients are naturally selected for various 
studies and where one attempts to compare 
results). It seems quite obvious that the above 
itemizations represent serious patient con- 
foundings—possible secondary gain, a con- 
comitantly improving economic milieu, and 
social-class contamination of the “psychoneu-
rotic” patients in Denker’s study and the more 
severely ill Landis patients—and indicate at 
the very least feasible alternatives to Ey- 
senck’s claim of comparability of the “control” 
patients to those usually seen in psycho- 
therapy. Indeed, in view of these confounding 
factors, the probability that the groups are 
comparable seems quite low; and hence the 
use of the Denker and Landis patients as 
control groups for base-line comparisons with 
psychotherapy seems invalid. One can ap- 
proximate comparability of groups only by 
random selection and random assignment of 
patients to treatments; or by careful match- 
ing of experimental and control patients on 
relevant variables; or by obtaining post facto 
measures of relevant patient characteristics, 
statistically controlling for their influence. 
These procedures represent the reasonable 
alternatives to the naive selection dictated by 
the Patient Uniformity Myth, as well as the 
recommended designs for any future studies 
attempting to arrive at a base line of “spon- 
taneous remission” for psychoneurotic pa- 
tients. The incomparability of the control 
groups vitiates the case for spontaneous re- 
mission of psychoneurotic disorders based on 
the Denker and Landis studies. 
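
As a purely illustrative sketch of the first of these alternatives, random assignment of patients to treatment and control conditions might be carried out along the following lines. The function name, the condition labels, and the seed values are assumptions introduced here for illustration only; they do not come from any of the studies discussed.

```python
import random


def randomly_assign(patient_ids, conditions=("psychotherapy", "no_treatment"), seed=0):
    """Shuffle patients, then deal them out to the conditions in rotation.

    Hypothetical helper: the names and the default seed are illustrative only.
    """
    rng = random.Random(seed)      # seeded so the allocation can be reproduced
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)          # order no longer depends on how patients were referred
    groups = {condition: [] for condition in conditions}
    for position, patient in enumerate(shuffled):
        # Dealing in rotation keeps group sizes within one patient of each other.
        groups[conditions[position % len(conditions)]].append(patient)
    return groups


if __name__ == "__main__":
    allocation = randomly_assign(range(1, 21), seed=1966)
    for condition, members in allocation.items():
        print(condition, sorted(members))
```

Because every patient has the same chance of entering either condition, systematic group differences in severity, social class, or economic circumstances become unlikely as the sample grows. Matching and post facto statistical control would require additional patient measures that are not sketched here.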

2. Did the Two Groups of Patients Ac- 
tually Receive No Psychotherapy? Let us 
look first at the Denker group, and again 
quote Rosenzweig (1954): 


In Eysenck’s words these patients were “regularly 
seen and treated by their own physicians with seda- 
tives, tonics, suggestion, and reassurance.” . . . These 
various presumably nonpsychotherapeutic techniques 
mentioned include suggestion and reassurance—well- 
known methods of psychotherapy; and psychiatrists 
regularly use sedatives and tonics as adjuncts to their 
practice. . . . The only difference between the work
of the general practitioner and of the eclectic psy-
chiatrist that could be assumed, in the absence of
detailed and specific knowledge, would be a difference
in thoroughness or expertness, not a difference in 
kind [p. 300]. 


In other words, from the Denker data, le- 
gitimate comparison could be made between 
psychotherapy of different levels of expertness, 
with the prediction being that the more ex- 
pert therapy would produce greater improve- 
ment than that of the general practitioner. 
But the crucial point is that the Denker 
group cannot properly be considered a con-
trol group for spontaneous recovery, since
the patients admittedly received some of the
elements of psychotherapy.

Similar doubt is cast upon Landis’ group.
Again to quote Rosenzweig:


To maintain that neurotic patients admitted to state
hospitals receive no psychotherapy is seriously open
to doubt. These institutions, despite their notorious
shortage of staff, usually make a special effort to
treat their neurotic admissions, because these cases
have a better prognosis, and because they are
more accessible to treatment [p. 301].


De Charms, Levy, and Wertheimer (1954)
add: “Some of Landis’ group did receive psy-
chotherapy,” and suggest a further contami-
nation in that “hospital confinement and
treatment may themselves be therapeutic.”
Luborsky (1954) elaborates further on the
same point:


Also as Landis points out (in objecting himself to
the use of the consolidated amelioration rate as a
base-line for “spontaneous” recovery) neurotics in
state hospitals are given a variety of treatments, in-
cluding some psychotherapy. And, as they are rela-
tively unusual occupants of state hospitals, they
probably get unusual treatment [p. 131].


This point can be underscored further by
adding that, since psychiatrists in state hos-
pitals are very likely human, it would not be
too unbelievable that they might seek out,
and perhaps enjoy a little, some contact with
a patient who was not divorced from reality,
who could converse reasonably well, who
presented some hope of recovery, and whose
treatment-of-choice could appropriately be
traditional psychotherapy.

One again is compelled by these arguments
to agree with Rosenzweig (1954):


It must then be concluded that the control
groups cited by Eysenck do not sharply differ from
the experimental groups in respect to the important
variable of having received psychotherapy. As before
with regard to illness severity, the necessary con-
trast between the base line and the experimental
groups becomes markedly attenuated [p. 301].


Further, since the spontaneous recovery phe-
nomenon by definition requires that control
patients not be treated, the violation of this
essential condition by the Landis and Denker
studies by itself negates their value as
evidence for the Spontaneous Remission
Assumption.

3. Are the Criteria for Improvement or
Recovery Used in the Denker and Landis
Studies Comparable to Those Used to Evalu-
ate Traditional Psychotherapy? Can the de-
gree of improvement or recovery reported in
these two studies be regarded as equivalent to
that reported for traditional psychotherapy?
In the first place, it is important to note that
terms like improvement or recovery are at
best ambiguous. As Luborsky (1954) states:


The terms say nothing about what the patient was
like at the beginning and end of treatment; they
can be and are applied to patients at the entire
range of mental health. A schizophrenic patient can
be called “recovered”; so can a patient with a
slight personality problem. Obviously the word
“recovered” is used differently in each case [p.
130].


The criterion for improvement or recovery
for Landis’ state hospital patients was “favor-
able discharge” from the hospital. Rosenzweig
(1954) reasons that the probability is quite
low that the criteria used to come to a
favorable discharge decision for hospital
patients are the same as those used for
termination of therapy outpatients.


In other words, while patients residentially treated
are generally considered in terms of hospital dis-
charge and return to the community, the criterion
of social recovery being highly relevant, patients
nonresidentially treated, as by psychoanalysis, live
continuously in the community and are worked with
in terms of radical therapy which, if successful,
permits them to live not only with others but with
themselves. This difference in therapeutic goal is so
great that percentage figures for residential and non-
residential treatment are dubiously commensurable.


It could be added, along similar lines, that
it is not too unreasonable to assume that
in many hospitals, especially for voluntary,
uncommitted psychoneurotic patients, fac-
tors other than personality condition, such
as daily patient quotas which determine the
hospital budget, pressure from relatives, pres-
sure from the patient himself, etc., often
come to bear on the decision to discharge a
particular patient.

The criteria of recovery utilized by Denker
are admittedly far superior to Landis’ dis-
charge rate. Recall that Denker used two
basic indices: (a) complaint of no further,
or very slight, difficulties, and (b) successful
social and economic adjustments by the pa- 
tient. Further, he followed up these patients 
for a 5 to 10 year period—a procedure that 
would have certainly strengthened Landis’ 
outcome data. This seems to represent a 
careful and sophisticated attempt to evalu- 
ate the recovery of his insurance patients. 
However, Cartwright (1955) asks: 

It is of some interest to speculate about what evi- 
dences were available in the files of the insurance 
company concerning successful social adjustments 
made by persons whose disability benefits had been 
terminated. Such termination must certainly be taken 
as evidence for the making of successful economic 
adjustments. But “complaint of no further, or very 
slight, difficulties” may represent little more than
no further supportable claims against the company 
[p. 291]. 


In other words, what motive would make an 
insurance company collect careful and de- 
tailed records of social adjustment of patients 
after they had withdrawn their claims? If
subjective report of the patients was given 
heaviest weight in these indices, as seems 
likely, then this report seems especially sus-
ceptible to the “hello-goodbye” effect (Hatha-
way, 1948), particularly if one recalls the
above argument regarding secondary gain 
(money) for these patients. 

These considerations compel one to agree 
with Rosenzweig, that “the standards of im- 
provement and recovery in Eysenck’s various 
patient groups, control and experimental, bear 
so little resemblance to each other that, once 
again, the basis of his comparisons has little 
demonstrable validity.” Since the criteria of 
recovery for the Landis and Denker groups 
seem quite divergent from those used for 
the evaluation of psychotherapy, the viola- 
tion of this essential condition in the Landis 
study, and likely the violation in the Denker 
study as well, further destroys their utility 
as evidence for the Spontaneous Remission 
Assumption. 

In summary, the discussion reported seems 
to lead unequivocally to the conclusion that 
there is no evidence for spontaneous remis- 
sion of psychoneurosis. Hence, the belief 
seems to be nothing more than a myth propa- 
gated by a popularized and naive interpreta- 
tion of two research studies. The patients 
used in the Landis and Denker studies and
the percentages of recovery reported by these 
authors in no way can be considered evidence 
of spontaneous remission for untreated psy- 
choneurotic patients. Consequently, Eysenck’s 
use of these percentages as a base line of 
spontaneous recovery against which to com- 
pare the efforts of psychotherapy is invalid. 
The discussion above has shown that the con- 
trol patients were very likely not comparable, 
in fact did receive some treatment (psycho- 
therapy) and hence are not controls, and their 
recovery was very likely evaluated on sig- 
nificantly different criteria. As Cartwright 
(1955) concludes: 


It is a regrettable accident that the question con- 
cerning the effectiveness of psychotherapy has been 
tied up with the question about spontaneous remis- 
sion. It has been assumed that the question about 
therapy is dependent for its answer upon the answer 
to the question about spontaneous remission. The 
regrettable part of this is that the worse assumption 
has been made that the answer to the spontaneous 
remission question is already known. Of course, it 
is said, people do recover spontaneously from neu- 
rosis and other psychopathological states. Do they? 
How many? How quickly? Certainly there is no 
reliable evidence in the studies of Landis and Denker. 
Indeed, the general absence of such evidence leaves 
it possible to conclude that the statement asserting 
the existence of spontaneous remission phenomena in 
regard to neurosis is made on a priori grounds,
rooted perhaps in loose analogy with the natural 
histories of coughs and colds. It seems to be an 
open question of fact as to whether or not there 
are spontaneous remission phenomena at all, and if 
so, what statistical characteristics they possess [pp. 
294-295]. 


It should be pointed out that the spontane- 
ous recovery rates reported for “psychoneu- 
rosis” are far from being reliable. Various 
survey studies do not agree with the two- 
thirds rate that Eysenck presents. As de 
Charms, Levy, and Wertheimer (1954) 
observe: 


Eysenck (1952) also states that these results are 
typical and that they are “remarkably stable from 
one investigation to another.” This statement is 
questionable in view of the reports of five year 
follow-ups such as (a) that of Friess & Nelson 
(1942) where one may interpret the results... . 
to mean that 20% is the spontaneous remission rate, 
and (b) that of Denker (1946) where 90% is 
reported as the spontaneous remission rate for a 
five year follow-up. If these two studies differ so 
widely, it appears that existing figures for spontane-
ous remission rates are not at all consistent. Although
Eysenck used a two year base, we see no reason
why a five year base may not be taken in com-
paring two studies, especially since we found no
other studies utilizing a two year follow-up with
which to check Eysenck’s claim of stability [pp.
234-235].


It can be added that two more recent follow-
up studies report rates which are also quite
different from Eysenck’s two-thirds percentage
(Hastings, 1958; Saslow & Peters, 1956).

Finally, it is important to emphasize that
it would be quite a useful contribution if
valid developmental data could be obtained
for emotionally disturbed individuals. But
the approach must be more sophisticated than
those of Landis and Denker. One cannot
operate on the Patient Uniformity Myth and
report spontaneous remission rates for “psy-
choneurosis.” Rather an attempt first must be
made to develop reliable operations by which
one can distinguish different types of psycho-
neurotics. Several more recent survey studies
of remission have attempted this kind of
differentiation (Hastings, 1958; Saslow &
Peters, 1956), but unfortunately used tra-
ditional psychiatric nosologies (hysterics,
obsessive-compulsives, etc.) which have been
shown to be unreliable classifications (Arn-
hoff, 1954; Ash, 1949; Dayton, 1940; Doe-
ring & Raymond, 1934; Mehlman, 1952;
Schmidt & Fonda, 1956; Wilson & Deming,
1927). If reliable measures can be developed
which meaningfully differentiate psychoneu-
rotic patients, then ideally one could obtain
developmental data covering the entire life
span for these respective groups. That is, it
would be useful not only to have data chart-
ing the course of an untreated disorder after
it has become a debilitating problem, but also
to obtain data reflecting the prior develop-
ment of the disorder. With data of this kind
one could not only more validly assess the
effects of specific therapeutic interventions,
but could also be able to predict which indi-
viduals will subsequently experience which
kinds of disorders.


The Myth That Present Theories Provide
Adequate Research Paradigms


Most of the basic deficiencies in psycho-
therapy research have derived from the at-
tempt to apply relatively unsophisticated
theoretical formulations. This section will at- 
tempt to demonstrate how deficient our pres- 
ent formulations are, as well as the necessity 
for basic revision before further research 
progress can occur. 

Three theoretical formulations have domi-
nated the research approaches to psycho-
therapy: Freud’s theory of psychopathology
and psychotherapy, Rogers’ earlier and later
notions regarding the essential conditions of
the therapeutic relationship, and “learning
theory” via the behavior therapy approaches.
Let us examine the adequacy of these formu-
lations as guides for research in psycho-
therapy. It is the thesis of this section that
a critical look at these systems reveals serious
inadequacies: basically that the theories are 
not comprehensive (i.e., do not exhaust the 
domain of variables operative in the therapy 
interaction and do not incorporate existing 
empirical data); and secondly, do not meet 
the requirements for an adequate paradigm 
or model for psychotherapy research. Let us 
look at these closely related difficulties. 

In order for a theoretical formulation to be 
adequate and useful it must first be compre- 
hensive, covering in its explanations the 
known facts and variables in its empirical 
domain. It seems rather apparent that none 
of the three traditional formulations meet this 
basic prerequisite. The basic difficulty seems 
to be that none of them has explicitly built
into its system propositions covering indi-
vidual differences regarding either the patient
or the therapist. This oversight seems to be
the direct result of the Uniformity Myths.
In other words, until quite recently most
theoreticians and researchers have implicitly
agreed that there is one ideal form of therapy
for all patients, and the function of research
is to tell us which of the above three best
approximates this ideal. Yet, as argued above,
clinical experience and recent empirical data
point to the centrality of both patient and
therapist individual differences in the out-
comes of psychotherapy. One searches present
formulations in vain for propositions incor-
porating these patient and therapist variables.
It seems, therefore, that one is compelled to
conclude that the present theories are either
too specific (behavior therapy) or too general
(analytical or Rogerian) in explaining the
known facts about psychotherapy. In either 
case, these formulations are not comprehen- 
sive, and hence basically inadequate. 

Another way of underlining their theoreti- 
cal inadequacy is to state that the present 
theories do not offer an adequate paradigm 
or model for psychotherapy research. A re- 
search paradigm optimally should perform 
several useful functions. Foremost, it should 
focus on the apparent independent variables 
in a domain of research, as well as on the 
scaling and methodological problems encoun- 
tered in attempts at operationalizing and ma- 
nipulating these variables. Secondly, it should 
provide a similar emphasis regarding the 
dependent variables. Finally, it should deal 
with the problem of implicit and explicit 
confounding variables which potentially inter- 
act and covary with the dependent variables. 
The final product would be an explicit listing 
and elucidation of the total matrix of vari- 
ables potentially and actually operative in the 
behaviors being studied in a particular 
research area.

Each of the three theoretical positions has 
failed to specify exactly what the independent 
or dependent variables are. None has method- 
ically dealt with the problems of the quality 
or quantity of outcome expected from the 
respective therapeutic interventions, or of 
differential outcome for different kinds of pa-
tients. None has dealt with the sampling and
other methodological considerations which, 
remaining unspecified, make it impossible to 
design a test of present constructs. Let us 
examine each of these “theories” of therapy 
in order to demonstrate more clearly this 
quite ambiguous state of affairs. 

Freudian Therapy. What is the heart of 
analytical therapy? What is the attitude, 
technique, or personality characteristic essen- 
tial before a therapist can be said to be doing 
analytical therapy? The answer seems quite 
unclear and at best multifaceted and inex- 
plicit. It seems to be an attitude: the thera- 
pist must present himself as a neutral and 
ambiguous stimulus to the patient, in order 
not to distort the patient’s task of free- 
association and dream production or hamper 
the appearance of transference phenomena. 
Yet, at subsequent stages of the interaction 
the analyst apparently becomes quite non-
ambiguous, offering interpretations of child- 
hood experiences and the transference rela- 
tionship. Where does the one attitude (or 
complex of attitudes) end, and the other 
begin? What are the behavioral cues which 
determine the shift of set? What is the 
interrelationship of the various attitudes 
(inaction, ambiguity and/or neutrality) pre- 
vailing in the earlier stages of therapy? How 
do these attitudes relate to the interviewing 
techniques of questioning and clarification 
which seem to prevail concurrently? At the 
later stages of therapy: Does interpretation 
of childhood experiences need to precede in- 
terpretations of the transference relationship? 
Does positive transference necessarily precede 
negative transference in each therapy inter- 
action? If not, what are the variables deter- 
mining the sequence? Does the therapist only 
interpret at moderate depth, that is, just 
beyond the preconscious? What are the guides 
for the frequency at which he interprets? 
What are the cues for determining the 
optimal timing of a given interpretation? Do 
interpretations need to be correct in order to 
be facilitative of the therapeutic task? What 
are the cues by which the therapist deter- 
mines whether a given interpretation has ac- 
complished its purpose? Does one interpret 
differently (at different depths, frequencies, 
or with different timing) for different kinds 
of patients? Further, the therapist’s person- 
ality seems to be crucial at all stages, in that 
the analyst himself should undergo therapy. 
But, how does one evaluate (what are the 
criteria?) whether an analyst has been “suc- 
cessfully” analyzed? What are the personality 
characteristics of the ideal analyst? To what 
extent can analytic therapy be accomplished 
by an unanalyzed or partially analyzed thera- 
pist? How do the personality characteristics 
relate to the therapist attitudes and tech- 
niques? Or are all these aspects (person- 
ality, situational, technique, etc.) essential 
ingredients of Freudian technique? 

Stone (1951) argues that all these ingredi- 
ents are essential: 


It is not enough to say that psychoanalysis recognizes 
resistance and transference; psychoanalysis has other 
technical precepts which, besides the “basic rule” 
include the exclusive reliance upon free association, 
regularity of time, frequency and duration, [...]
to five hours per week, use of the recumbent posi-
tion, the confinement of the analyst’s activities to
interpretation, the maintenance by the analyst of an
attitude of emotional passivity and neutrality in
accordance with which he offers no gratification to
the patient’s transference wishes, abstention by the
analyst from giving advice or participating in the
daily life of the patient, absence of immediate
emphasis upon the curing of symptoms [pp. 2...-
218].


It is Stone’s belief that changes in any one
of these features of psychoanalysis might well
affect the dynamics of the transference, and
hence the whole course of a treatment.

When we search for the Freudian dependent
variable we encounter similar ambiguities. Is
insight (or making the unconscious conscious)
the crucial in-therapy product of the complex
pattern of therapist activity in the therapy
hour? If so, what are the cues by which
insight is evaluated? Is insight regarding one’s
childhood behavior, or regarding the transfer-
ence relationship, sufficient? Or is “working
through” essential? How does the therapist
evaluate whether working through has been
accomplished? At what level or degree of
working through does the therapist begin to
talk about termination: When has there
been enough working through? Can working
through be accomplished without the prior
accomplishment of insight? Or what kind of
insight and what level is required (what cues
are used to evaluate these factors?) before
working through can be effective? Regarding
extra-therapy criteria of successful outcome:
What specific patient characteristics or be-
haviors are implied by a concept of a totally
reintegrated or rebuilt personality? How does
the successful patient behave towards other
people? What criteria does he employ in
evaluating his own goal-seeking behavior?
What attitudes does he hold toward himself,
toward his family, toward people in general?
Can he be unsuccessful at a particular job?
Can he be single? Will he be involved in
social organizations or civic affairs? Do dif-
ferent patterns of these specific extra-therapy
criteria of success emerge for different pa-
tients? If so, which ones with which kinds
of patients? And what are the cues by which
one distinguishes the different kinds of
patients?
To the author’s knowledge no one has dealt
with these theoretical questions methodically
and exhaustively, neither Freud nor his fol-
lowers.* What is the independent variable,
the crucial therapist behavior (attitude, tech-
nique, personality characteristic, or what have
you) which brings about patient change
in the therapy relationship? What is the
crucial in-therapy patient change that occurs
as the result of the therapist’s behavior?
How and in what manner does this in-therapy
change mediate change in behavior outside
the therapy hour? Since there are no theo-
retical answers to these questions each re-
searcher attacks the variables most interest-
ing to himself (e.g., interpretation, therapist
ambiguity, therapist general activity, counter-
transference, insight, resistance, anxiety, and
the like). Although Freudian theory seems
to generate many constructs, the explicit
integration or description of the essential
ingredients nowhere occurs.

Likewise, many methodological questions
arise when one begins to investigate specific
variables. Since the theory is not explicitly
elaborated, one has little theoretical guidance


*Fenichel made significant beginnings in the direc-
tion of systematizing the analytic theory of therapy
(Fenichel, 1941, 1953, 1954) but his death interrupted
the endeavor.

Recently, some of these factors have also been
considered by analytic investigators. Levy (1961),
reporting a large-scale research project under the
directorship of Franz Alexander, recently described
some of the conceptual difficulty their research has
encountered: “Another important element of the
therapeutic process we are investigating is the role
of insight in causing changes in the patient’s atti-
tudes. . . . It is difficult, however, to be certain
about both the qualitative and quantitative aspects
of this feature of the therapeutic experience. Several
questions need to be investigated. To what extent
do cognitive-intellectual processes and/or emo-
tional experiencing with or without awareness oper-
ate? What are their relative importance? How do
they vary from case to case? To what extent is the
therapist-patient relationship important primarily to
provide support and relieve anxiety so that the
patient can acquire insight? What are the relative
[...] of insight on the one hand and the living
[...] personal experience with a new (substitute)
parent on the other? To what extent is the learning
[...] unconscious, and how much does conscious
[...] enter into it? How important is insight
[...] factors, i.e., the re-experiencing of and
understanding of the original childhood experiences
in which the neurotic patterns developed? [p. 129].”
in resolving these problems. For example,
how does one decide where in the therapy 
sequence to study his particular variable (the 
problem of unit or segment location)? Is the 
variable operative equally in the early, middle, 
and later stages of therapy? Is the variable 
limited to a specific content of the patient, or 
a particular patient-therapist interaction in 
therapy? Does one need to study the entire 
interactive or content sequence, or can one 
sample the sequence? If one can sample, what 
size sample is necessary for validly reflecting 
the dimension in which one is interested? 
(What is a valid unit or segment size?) Does 
location in the interview hour by itself bias 
the kind of sample one obtains? Can a par- 
ticular dimension be rated validly from the 
therapist’s (or patient’s) verbalizations alone, 
or does one need the preceding and succeeding 
comments of the other participant? (Is con- 
text necessary?) What level of clinical so- 
phistication is needed to rate the dimension 
validly? What are the independent, extra- 
therapy criteria for the in-therapy measure? 
Does the patient’s report of what the therapist 
is doing need to be congruent with the inde- 
pendent measure obtained? 

Rogerian Therapy. Rogers’ theoretical statement
appears in several places (Rogers, 1957,
1959a; Rogers et al., 1966) but perhaps most 
succinctly in his “Necessary and Sufficient 
Conditions” paper (1957). In this paper he 
specifies clearly the three therapist attitudes 
which must be communicated to the patient 
before constructive personality change can 
occur for that patient. The three attitudes or 
conditions are unconditional positive regard, 
empathic understanding, and congruence. Ac- 
cording to Rogers, if the therapist communi- 
cates these attitudes to the patient, construc- 
tive personality change will occur. Moreover, 
he specifies the process dimension along which 
this patient change occurs, describing the
seven “strands” of this process and the
successive levels of each
strand (Rogers, 1959b; Rogers, Walker, & 
Rablen, 1960). At first glance the theoretical 
statement seems quite simple and easily verifi- 
able. Therapist technique and personality 
characteristics are not crucial. Rather, the 
crucial therapist behavior is the communication
of the three conditions to the patient.
Hence the independent variable seems clear- 
cut: it is multidimensional, consisting of three 
therapist attitudes. Likewise, the in-therapy 
dependent variable is clearly delineated, as 
presented by the above-mentioned patient 
“process” construct. However, a closer ex- 
amination of Rogers’ theory from the same 
points of reference as for the Freudian frame- 
work reveals that the specificity of Rogerian 
theory leaves much to be desired. 

What level of these therapist attitudes is 
necessary before constructive personality 
change begins to occur? Are they instead all- 
or-none phenomena? (The various operational 
definitions of therapist conditions to date all 
take the form of a continuum). What is the 
interrelationship among the three conditions? 
Is it necessary for all three conditions to be at
an equally high level? If so, what are the 
respective levels? Will a low level of one 
condition cancel the effectiveness of other 
high condition levels? Is one of the condi- 
tions a precondition? (Rogers seems to sug- 
gest that congruence may operate in this 
fashion.) Need the level of conditions be high 
at every stage of the interaction, or is it suf- 
ficient that they average at a high level for 
the entire interaction? How does one weight 
the conditions when combining them for sta- 
tistical analyses? Are the conditions related to 
therapist personality characteristics? Is the 
patient’s view of therapist conditions the 
crucial and only measure required? How will 
therapists’ and independent observers’ views 
agree or disagree with the patient’s viewpoint, 
and is agreement or disagreement necessary? 
How is the operation of the separate conditions 
balanced by the others? (Is the appropriate- 
ness of congruence evaluated by empathic 
understanding?) Is the optimal patterning of 
conditions different for different patient groups 
or types? 
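
The weighting question raised above has practical consequences, which a
small numerical sketch may make concrete. The three combination rules
below (unweighted mean, product, and minimum) are illustrative
assumptions only; they are not rules proposed by Rogers or by any study
cited here, and the therapist ratings are invented for the example.

```python
from math import prod

# Hypothetical ratings (0-1 scale) of two therapists on the three conditions:
# positive regard, empathic understanding, congruence. Invented values.
therapists = {
    "A": (0.9, 0.9, 0.2),   # two high conditions, one low
    "B": (0.6, 0.6, 0.6),   # uniformly moderate
}

def additive(levels):
    """Unweighted mean: a low condition can be offset by high ones."""
    return sum(levels) / len(levels)

def multiplicative(levels):
    """Product: a low condition sharply reduces the composite."""
    return prod(levels)

def weakest_link(levels):
    """Minimum: the lowest condition alone sets the composite."""
    return min(levels)

# Compare which therapist receives the higher composite under each rule.
for rule in (additive, multiplicative, weakest_link):
    scores = {name: rule(levels) for name, levels in therapists.items()}
    best = max(scores, key=scores.get)
    print(f"{rule.__name__:>14}: higher composite for therapist {best}")
```

With these invented ratings the mean favors therapist A, whereas the
product and the minimum favor therapist B. The ordering of therapists
thus depends on the combination rule, which is why the theory would
need to specify it.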

Regarding Rogers’ patient-process formula- 
tion, the dependent variable in his model: 
How are the seven strands interrelated? Does 
one need to utilize all seven strands in order 
to validly reflect constructive personality 
change? What function does process change 
take over the therapy interaction? Is it 
monotonic? negatively accelerated? U-shaped? 
or does it represent some other pattern? If it 
is not monotonic, what explains the points of
acceleration or deceleration of the patient’s
process? Where over the sequence of psychotherapy
does an investigator sample in order
to validly reflect the process occurring? What
size of sample is necessary, and from what
location in the therapy hour, in order to
validly represent the process occurring in an
individual therapy hour? (Rogers’ therapist
and patient dimensions are theoretically described
as content-free; hence that issue is at
least resolved.) Can one measure conditions
or process from the individual participant’s
verbalizations alone, or is it necessary that one
have context, that is, the other’s preceding
and subsequent verbalizations? (E.g., does
one need the patient’s discourse subsequent
to the therapist’s statements in order to measure
empathic understanding?) How and in
what manner is in-therapy patient process related
to extra-therapy criteria of successful
outcome? How does a deeper level of Experiencing
relate to a patient’s attitudes toward
his family and others, his performance on his
job, his remission of symptoms, etc.?

Kiesler, Klein, & Mathieu (1966) conclude
from the findings of a 5-year study of Rogerian
therapy with a hospitalized schizophrenic
population:


In the future, studies addressed to issues emanating
from Rogerian theory will require more detailed
definition and elaboration of both conditions
and process factors, as well as their conceptual
integration with other aspects of the therapy setting
(including patient and therapist characteristics,
interactional factors, and empirical phenomena). Such
factors will then require more rigorous experimental
control. Before this is possible, however, more
extensive methodological research is necessary in order
to resolve the many issues presented by the complex
process phenomena. When, as in this study, theoretically
central variables proved to be imbedded in
a more complex framework, exploratory studies are
necessary to evaluate which of the many theoretically
extraneous factors in the setting require particular
consideration and control. Only with such
pilot information, and with validly anchored instruments
for the assessment of therapy variables, can
more definitive experimental studies be undertaken.


Behavior Therapy. One expects the paradigm
here to be quite sophisticated inasmuch
as it is based on behavior theory. It is rather
unexpected to find that the application
of learning theory to the psychotherapy interaction
has not led to a single behavior therapy,
but rather to several kinds of behavior
therapy: aversion therapy, negative practice, 
operant conditioning, reinforcement with- 
drawal, and desensitization (Grossberg, 1964). 
This multiplicity of products of itself makes
one question the relevance of the original
learning theory to the subsequent applications.
As Colby (1964) ponders:


It is often difficult to see how specific therapeutic 
techniques are deducible from the theory. Learning 
theory and behavioristic approaches pride themselves 
on their grounding in scientific principles. Yet different
learning theorists derive different techniques from
the same theory, and utilize similar techniques from
different theories. The contradictory disagreement
within the paradigm is obvious to all, but each
[...] for the exclusive truth status of his position
[p. 362].


Still, at first sight, the various derivations do 
seem to represent greater clarity and simplicity
than either the Freudian or Rogerian
models. However, closer inspection reveals
several inadequacies in these derivations. 

Are the various techniques equally effective 
in the therapeutic situations where behavior 
therapies are traditionally applied? If they 
are, which are more economical? If they are
not, what is the ordering of the techniques 
vis-a-vis effectiveness? How are the respective 
symptom removals or behavior reinforce- 
ments related to independent, extra-therapy 
indices of improvement or success (such as 
ability to get along with others, performance 
on the job, interactions with one’s family and 
friends)? Is the particular behavior therapy
technique (the independent variable) a unidimensional
manipulation? Or are there other
implicit facets of the technique besides the 
theoretically described modification? With 
desensitization, for example, how much of
symptom removal might be the result of suggestion?
Is it possible that the removal of
symptoms occurs not from “desensitization”
but rather because the patient has learned to
discriminate his anxieties more clearly as the
result of being asked to construct an anxiety
hierarchy? Is it possible that by simply teaching
the patient to relax, this relaxation ability
generalizes to all other situations, including
the one in which the phobia is apparent?
Are the therapist’s attitudes or other personality
characteristics influencing the effectiveness
of the behavior-therapy techniques? If
so, how do these factors interact with the be- 
havior technique for which patients? Given 
that changes in patient attitudes and feelings 
about himself and others are theoretically 
irrelevant (symptom removal is the only 
crucial consideration), do the various behav- 
ior techniques nevertheless affect these atti- 
tudes and feelings, and in what manner? Do 
the constructs used to define the technique 
actually lead to standardized and replicable 
operational procedures for different therapists 
applying the same technique? 2

In an excellent critical review of the be- 
havior therapy literature, Breger and Mc- 
Gaugh (1965) conclude: 


It is our opinion that the current arguments sup- 
porting a learning-theory approach to psychotherapy 
and neurosis are deficient on a number of grounds. 
First, we question whether the broad claims they 
make rest on a foundation of accurate and complete 
description of the basic data of neurosis and psycho- 
therapy. The process of selecting among the data 
for those examples fitting the theory and techniques 
while ignoring a large amount of relevant data 
seriously undermines the strength and generality of 
their position. Second, claims for the efficacy of 
methods should be based on adequately controlled 
and adequately described evidence. And, finally, when 
overall claims for the superiority of behavioral 
therapies are based on alleged similarity to labora- 
tory experiments and alleged derivation from “well 
established laws of learning” the relevance of the 
laboratory experimental findings for psychotherapy 
data should be justified and the laws of learning 
should be shown to be both relevant and valid [p. 
339]. 


2 “The ‘imagination of a scene’ is hardly an objectively
defined stimulus, nor is something as general
as ‘relaxation’ a specifiable or clearly observable
response [Breger & McGaugh, 1965].”

In summary, the basic deficiencies in prevailing
theoretical formulations are that they
perpetuate and do not attack the Uniformity
Myths described in the previous section; do
not explicitly deal with the problem of confounding
variables; and do not specify the
network of independent, dependent, and confounding
variables in sufficient detail
to permit researchers to solve sampling and
other methodological problems. In view of
these considerations, it seems evident that
our formulations about psychotherapy contain
serious inadequacies. Until our present
theories are brought up to date by being made
more comprehensive and by spelling out in
much detail the variables of the theoretical
paradigm (or until new formulations are introduced
which meet the same requirements),
it seems that psychotherapy investigators
must continue to make arbitrary decisions regarding
these parameters, or attempt to fill in
the paradigm themselves with much exhaustive
but necessary prior methodological research.
As Meehl (1955) comments:


Considering the state of our knowledge, we still do 
not seem sufficiently daring and experimental about 
therapeutic tactics. Even when practical exigencies 
force a certain amount of trial-and-error, doctri- 
naire views about therapeutic theory are likely to be 
left unquestioned. The lessons would seem to be 
that we know so little about the process of helping 
that the only proper attitude is one of maximum 
experimentalism. The state of theory and its rela- 
tion to technique is obviously chaotic whatever our 
pretensions [pp. 374-375]. 


Some Miscellaneous Confusions 


The distinction traditionally made between 
process and outcome research seems to have 
clouded the thinking regarding design, par- 
ticularly in the latter area. The misconception 
seems to take the form: process is not out- 
come research, and outcome research is not 
process research. The position taken here is
that these propositions are incorrect: to some 
extent process research is outcome research, 
and outcome research is equivalent to process 
investigation. 

The traditional process-outcome distinction 
is made as in the following: 


The studies to be summarized here can be roughly 
dichotomized into those with principal concern as to 
how changes took place, therefore focusing on the 
interchange between patient and therapist (i.e. the
process), and those that focus on the end point, to 
answer the question of what change took place (i.e. 
the outcome) [Luborsky, 1959, pp. 320-321]. 


Typically, process studies have dealt with the 
therapist-patient interview interaction, while 
outcome studies have focused on changes in 
the patient as the result of therapy. Process 
has been studied by various content-analysis 
procedures as well as by scales or question- 
naires developed to measure therapist, pa- 
tient, or interactional dimensions; whereas 
outcome investigations have focused on patient
changes in test or other behavior from
the beginning to termination of therapy.

Two unfortunate effects seem to have followed
from this somewhat ambiguous distinction:
“outcome” researchers have tended to
focus exclusively on pre-post patient differentiations;
and patient process changes have not
been considered legitimate outcome.

In the first place, the exclusive reliance
upon pre-post measurement in outcome designs
may lead to findings that are invalid or
terminate research prematurely. For example,
if patient improvement (as tapped by a particular
criterion measure) is not a monotonic
function but rather curvilinear in some lawful
fashion, a focus on only two points of time
may obscure or distort meaningful patient
improvement. To the extent that the function
of outcome change is unknown, it seems that
repeated-measures designs have just as legitimate
an application to outcome studies as to
interview-by-interview process changes. Further,
pre-post designs demand that one have
highly reliable change measures before one
can expect to tap sensitively any improvement
that occurs. As Chassan (1962) observes:


It becomes apparent that mere end-point observations
for the purpose of estimating change in the
patient-state after, say, the intervention of some
form of treatment places generally severe limitations
on the precision of the estimation of the
change. For random fluctuation in the patient state
can then easily be mistaken for systematic change.
To overcome this difficulty, frequent repeated observations
must be made of each patient in the
study [p. 615].


Thus, it first seems apparent that the traditional
process-outcome distinction has perpetuated
the relatively exclusive use of pre-post
designs in outcome studies, with the unfortunate
effect that information about the
form of the function which represents the
improvement between the two end points, as
well as for follow-up periods, has not been
clarified; whereas repeated-measures designs
would offer this essential type of information.
Secondly, the use of only two measurement
points has increased the likelihood that any
differences observed may be only chance fluctuations
due to unreliability of the measure.
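
The two difficulties with end-point measurement described above can be
illustrated with a small simulation. The sketch below is not drawn from
any study cited here: the inverted-U course of improvement, the noise
level, the number of sessions, and the names `true_state`, `observe`,
and `averaged_change` are all assumptions chosen only for illustration.

```python
import random
import statistics

def true_state(t):
    """Hypothetical inverted-U course of improvement over sessions 0..20."""
    return 10.0 - (t - 10) ** 2 / 10.0   # 0 at t=0, 10 at t=10, 0 at t=20

def observe(t, noise_sd, rng):
    """One unreliable observation of the patient state at session t."""
    return true_state(t) + rng.gauss(0.0, noise_sd)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
noise_sd = 2.0           # assumed unreliability of the measure

# 1. End points alone miss the curvilinear course entirely.
true_pre_post = true_state(20) - true_state(0)
true_peak = max(true_state(t) for t in range(21))

# 2. Although the true end-point change is zero, single pre-post
#    difference scores still vary by chance from patient to patient.
single = [observe(20, noise_sd, rng) - observe(0, noise_sd, rng)
          for _ in range(2000)]
sd_single = statistics.stdev(single)

# 3. Averaging k repeated observations at each end point shrinks that spread.
k = 5

def averaged_change():
    pre = statistics.mean(observe(0, noise_sd, rng) for _ in range(k))
    post = statistics.mean(observe(20, noise_sd, rng) for _ in range(k))
    return post - pre

sd_avg = statistics.stdev([averaged_change() for _ in range(2000)])

print(f"true pre-post change: {true_pre_post:.1f}; true peak: {true_peak:.1f}")
print(f"SD of change score, single observations: {sd_single:.2f}")
print(f"SD of change score, {k} observations per end point: {sd_avg:.2f}")
```

Under these assumptions the true pre-post difference is zero even though
the patient state rises and falls between the end points, and the chance
spread of the change score is smaller when each end point rests on
several observations rather than one.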

The second unfortunate result of the process-outcome
dichotomy has been that patient
process change within the interview has not
been considered explicitly as legitimate outcome.
It seems clear, however, that patient
improvement manifested in his interview be- 
havior is just as legitimately outcome as any 
other form of extra-therapy change. Certainly 
not all process investigation is equivalent to 
outcome—for example, if the investigator is 
focusing exclusively on the therapist, or on 
one point only of the therapy sequence. But 
to the extent that one is investigating in- 
therapy patient changes he is concerned di- 
rectly with outcome; and to the extent that 
one is interested in outcome, he needs to be 
cognizant of in-therapy patient changes (a 
point rarely mentioned). To say this differ- 
ently, there seem to be two important areas 
of patient change: that change manifest in 
the therapy hours themselves, and concomi- 
tant changes observed outside the therapy 
interaction (in situ). Process research begins 
with the in-the-interview behavior of the 
patient; outcome investigation begins with 
his outside-the-interview improvement. The 
crucial implication is that for either to be 
maximally useful, it must consider the other 
focus or perspective. It is necessary for both 
investigators to formulate some clear para- 
digm of the dependent variables of psycho- 
therapy, both in- and extra-therapy, and their 
theoretical interrelationships. 

It seems, then, that the process-outcome 
confusion has resulted primarily from ignor- 
ing the fact that some interview data reflects 
outcome (patient change); or, said differently, 
that some of the outcome of therapy may be 
evident in the interviews. Perhaps it would be
helpful to discard these terms, instead referring
to in-therapy (interview) studies (via
direct observation, movies, tape recordings, or
transcripts) and extra-therapy investigations
(dealing with “in situ” observations). It must
also be added that since the statistical function
of these in- and extra-therapy changes
is unknown, one should seriously consider the
appropriateness of repeated-measures designs
in attempting to evaluate the effects of psychotherapy.

A difficulty in psychotherapy research
has been connected with the scientific
[...] of current diagnostic categories for
mental illness. The problem stems not only
from the fact that the classification systems
are unreliable (Arnhoff, 1954; Ash, 1949; 
Dayton, 1940; Doering & Raymond, 1934; 
Mehlman, 1952; Schmidt & Fonda, 1956; 
Wilson & Deming, 1927) but equally im- 
portantly from the fact that differential diag- 
nosis makes no difference—that is, leads to no 
prescribed differential psychotherapeutic treat- 
ment. A suggested answer to both of these 
nosological difficulties is that we may be look- 
ing in the wrong places for a reliable and 
valid diagnostic scheme. Perhaps the answer 
to the classification problem lies in differen- 
tial patient behavior found in the therapy 
hour itself. If therapists in fact deal differ- 
ently in therapy with different patients, then 
perhaps the patient cues to which the thera- 
pist differentially responds can be isolated 
and reliably measured from that interaction. 
If the manner in which the patient talks 
about himself in therapy indeed provides a 
reliable differentiation of patients, then the 
likelihood seems good that the process dimen- 
sions isolated would be directly relevant to 
differential therapeutic techniques. It seems 
that this possibility has been overlooked to 
date. 

Finally, an unrealistic hope prevailing in 
psychotherapy research has been the belief 
that sooner or later “The Definitive Study” 
will be published which once and for all 
proves the effectiveness of psychotherapy and 
defines the process by which it works. This 
belief seems to have motivated the prolific 
subsidization of the large outcome research 
projects prevalent in the last decade. One of 
the functions of this paper is to demonstrate 
the infinitesimal probability that any one-shot 
research attempt will ever significantly ad- 
vance our knowledge in this area. The busi- 
ness at hand for therapy (just as for any 
other) research seems clear: painstaking in- 
volvement with delineated problems until 
repeated replication of individual findings has 
been demonstrated, and subsequent attack of 
closely related or ancillary questions. As See- 
man (1961) has observed, investigators need 
to “dispel the notion that some single research 
package is likely to be devised to answer a 
great many questions all at once.” Rather 
the pattern of research required is “one of
plugging away at small bits of knowledge
which, only after an appreciable period of 
time, might attain a higher order of signifi- 
cance.” 


THE SEARCH FOR A PARADIGM


In the domain of psychotherapy there is no single 
shared paradigm commanding consensus. With con- 
siderable overlap the leading current paradigms are 
the psychoanalytic, learning theory and existential. 
Signs of crisis are to be found in each in the in- 
creasing recognition and public acknowledgement of 
limitations and impasses [Colby, 1964, p. 347]. 


The previous section described in some de- 
tail the theoretical inadequacies present in 
the existent models for psychotherapy. Fur- 
ther inadequacy lies in the fact that (with 
the possible exception of Rogers) these 
theoreticians or their disciples have not modi- 
fied their formulations in light of recent em- 
pirical evidence. For despite the fact that 
conceptualizations have been vague and the 
designs of studies leave much to be desired, 
there has emerged a consistent body of data 
at least regarding patient prognostic variables 
(e.g., social class, intelligence, and the like) 
which has not been incorporated into any 
theoretical system. Additional evidence is sug- 
gestive that therapist personality and expec- 
tations, as well as therapist-patient relation- 
ship factors are also critically related to thera- 
peutic outcome. Until these various factors 
are incorporated into theoretical formulations 
(existent or new) these models cannot be 
utilized meaningfully in therapy research— 
and therapy research will remain in a state of 
crisis. It is the purpose of this section to 
spell out the minimum requirements for a 
useful psychotherapy research paradigm by 
attempting to delineate the relevant factors 
suggested by current empirical data, by elab- 
orating on the methodological issues that have 
to be dealt with, and by doing this in a lan- 
guage sufficiently general to be applicable to 
researchers of differing orientations. The goal 
is not to construct an adequate paradigm, 
but to try to outline the minimal criteria any 
paradigm must satisfy before its adequacy 
can even begin to be considered.

The plan of the following, therefore, is to 
examine critically the individual psycho- 
therapy situation in order to delineate, in 
general terms, the independent and dependent
variables as well as the confounding variables
that have been shown to be built into and
complicating this system. Concomitantly, an
attempt will be made to derive some suggestions
regarding experimental design in light of
these considerations (although this is not the
primary focus, since the statistical and design
issues have been addressed quite competently
by others: Campbell, 1957, 1963; Edwards &
Cronbach, 1952; Patterson, 1956; Underwood,
1957). The “variable-model” approach
used follows very closely the research model
excellently presented by Underwood with,
however, an exclusive focus here on the
therapy research situation.


Independent Variable(s) 


What is the independent variable in psychotherapy
research? Clearly its choice depends
upon one’s particular theoretical orientation
or observational hunches. Just as clearly, it
is evident from the above that present formulations
have not specified in sufficient detail
what these independent variables are: For
analytical therapy it lies somewhere among a
matrix of therapist attitudinal, technique, and
personality factors (e.g., ambiguity, interpretation,
personal maturity). For Rogerian
therapy it lies somewhere in an interactional
matrix of three therapist Conditions or attitudes
(positive regard, empathic understanding,
and congruence). For behavior therapy
it falls somewhere in the communication by 
the therapist of specific unlearning procedures. 
Obviously, more critical thinking needs to be 
given to the exact delineation of the therapist 
variable or variables instrumental in effecting 
patient change. 

Generally, it seems clear that the independent
variable in psychotherapy has to lie somewhere
in the therapist and his behavior. It
seems necessary that some aspect of the therapist
(attitude, technique, personality characteristic,
and/or the like) be communicated to
the patient to some degree before one can
expect the patient to change in some manner.
Ideally, there would be but one therapist di- 
mension communicated to the patient, thereby 
effecting beneficial changes in that patient. 
Practically, however, few if any theoreticians 
have talked of a single therapist dimension of 
behavior as crucial, but list several or many,
emphasizing as they go the extreme complexity
of the therapy relationship. Very likely,
then, the independent variable for psychotherapy
research needs to be multidimensional,
in that more than one aspect of the
therapist is a crucial antecedent of beneficial
patient change. An implication of this point,
and an equally crucial one, is that the exact
interrelationship or patterning of the variables
in the multidimensional model needs to be
clearly specified in any theoretical formulation.

Now, if psychotherapy were a one-time 
event this kind of model of independent 
variables might be adequate. However, since 
therapy is a sequential treatment procedure, 
one’s model needs to be concerned further 
with the dimension of time. Are the same
aspects of the therapist operative in the same
pattern over the entire therapy interaction?
If so, this needs to be explicitly defined theoretically.
If, on the other hand (as seems
more likely), one or more therapist variables
are crucial at one phase of the interaction,
and others are indicated at other periods, then
it is necessary to specify these time interactions.
Otherwise a researcher may be sampling
at the inappropriate therapy period in his
attempt to investigate specific theoretical dimensions.
Further, the model would be less
difficult if psychotherapy were an agreed-upon
perfect technique effecting changes regardless
of type of patient. However, since it seems
more likely that “psychotherapy” represents
in practice heterogeneous therapist performance
depending upon the kind of patient with
whom he is dealing, then one’s model must
delineate differential levels or classes of independent
variables which are correlated respectively
with these patient individual differences.

Hence, if psychotherapy research is to progress,
it seems essential that theoreticians
and/or investigators first define therapist behavior
in very precise terms: by specifying
the dimensions along which they vary, by specifying
the exact interrelationships among
these dimensions at separate time-points in
the therapy interaction, and by specifying
[...] differentiations for various kinds or levels
of patient disorder.
Just as much imprecision has been manifest
regarding the dependent variable in psycho- 
therapy research. Here the Patient Uniform- 
ity Myth has dictated the search for the one 
patient “process” dimension along which 
beneficial patient change occurs. For analytic 
therapy, this dimension has resided some- 
where among such variables as insight (mak- 
ing the unconscious conscious), working 
through, reduction in anxiety and resistance, 
etc. For Rogers, it is found somewhere among 
his seven strands of process or Experiencing. 
For behavior therapy it seems to reside some- 
where in the process of anxiety reduction and 
symptom removal. 

Many of the same concerns expressed re- 
garding the independent variable are germane 
here. Does patient-beneficial change occur 
along one dimension or, more feasibly, along 
several dimensions? Are the same patient- 
change dimensions operative at all phases of 
psychotherapy? If so, what is their pattern- 
ing? If not, which dimensions are changing in 
what manner at the different phases? Do we 
need different dependent variables of change 
for different kinds of patients? If so, what 
are the diagnostic dimensions involved, and 
what are the respective differential patient- 
change processes? Are these dependent varia- 
bles manifest in the in-therapy verbal com- 
munications of the patient? If not, what are 
the extra-therapy manifestations of this 
change? If so, how are the in-therapy com- 
munication variables related to the extra- 
therapy manifestations? That is, how does the 
in-therapy process mediate changes in extra- 
therapy patient behaviors? It seems clear that 
the dependent variables of therapy are to be 
found somewhere in the in- or extra-therapy 
verbalizations and/or behavior of the patient, 
and in changes along these dimensions in a 
“positive” direction over the therapy sequence. 

In summary, then, the basic skeleton of a 
paradigm for psychotherapy seems to be
something like the following: The patient 
communicates something; the therapist com- 
municates something in response; the patient 
communicates and/or experiences something 
different; and the therapist, patient, and 
others like the change (although they may 
like it to different degrees, or for divergent
reasons). What the therapist communicates 
(the independent variables) is very likely 
multidimensional (and the patterning of this 
multidimensionality needs to be specified), 
and may be different at different phases of 
the interaction for different kinds of patients. 
Similarly, what the patient communicates and/or
experiences differently (the dependent
variables) is likely multidimensional (and 
the patterning of that multidimensionality 
needs to be clarified) and may be different at 
distinct phases of the interaction. The enor- 
mous task of psychotherapy theory and re- 
search is that of filling in the variables of this 
paradigm. 

It should be added at this point that even 
if formulations were developed to the point of 
filling in this paradigm, there would still be 
many basic methodological problems before 
one could appropriately begin to process the 
raw data of therapy. These problems have 
been encountered in many process studies and 
concern such issues as unit size, segment loca- 
tion, context, interdependency of patient and 
therapist measures, data form and presenta- 
tion, and parsimonious interpretation. The 
empirical solution of these methodological 
problems is necessary before one can validly 
test theoretical propositions. Since these is- 
sues have been dealt with in detail elsewhere 
(Bordin et al., 1954; Kiesler, 1966) they will 
not be elaborated again here. Suffice it to say 
that many process investigators have com- 
pletely ignored these basic methodological 
issues. 


The Problem of Confounding 


Since the minimal requirements for a ther- 
apy paradigm have not been met by theoreti- 
cal formulations, confounding variables have 
contaminated much psychotherapy research.
In fact, the major value of investigations to
date has been the demonstration of the multiple
and varied sources from which confounding
can occur. Until this contamination is
dealt with, research findings will continue to
be ambiguous and consequently subject to
alternative interpretations. The essential goal
of research is to “design the experiment so
that the effects of the independent variables
can be evaluated unambiguously [Underwood,
1957].”

This final section will examine the various
sources of confounding that existing paradigms
have ignored. It will utilize Underwood’s
(1957) three classes of confounding
variables (subject, task, and environmental)
by applying them directly to the psychotherapy
situation. Hopefully, the result will be a
listing of variables or variable domains which
need to be integrated into the above paradigm
and/or whose contaminations need to be
eliminated by experimental or statistical control
procedures in individual research.

Subject Confounding Variables. According
to Underwood, subject confounding occurs
when stable or temporary factors or dimensions
in the experimental subjects are per se
relevant to, or inducive of, differences in the
dependent variable measures. If these subject
factors are permitted to go uncontrolled, one
cannot unambiguously interpret dependent
variable changes as a function of manipulations
of the independent variable. Psychotherapy
research of the past decade has
shown that virtually the entire domain of
patient characteristics (demographic, intellectual,
motivational, semantic, perceptual expectancies)
is relevant in greater and lesser
degrees to reactivity to psychotherapy. As
Luborsky (1959) notes:


A number of studies . . . are concerned with
identifying qualities of patients that are associated
with staying in treatment, improving in treatment,
or returning once having left treatment. There are
many such studies now in the literature. They have
not yet come to anything definitive, but they seem
on the verge of it, for the results of all of these
studies point in the same directions. To put these
trends simply: Those who stay in treatment improve;
those who improve are better off to begin
with than those who do not; and one can predict
response to treatment by how well they are to
begin with [p. 324].


And Strupp (1962) observes: 


It is becoming increasingly clear that therapists
have fairly specific (and valid) notions about the
kinds of attributes a “good” patient should possess
as well as about those attributes which make a
patient unsuitable for the more usual forms of investigative,
insight-producing psychotherapy. Patients
considered good prognostic risks are described as
young, attractive, well-educated, members of the
upper middle class, possessing a high degree of ego-strength,
some anxiety which impels them to seek
help, no seriously disabling neurotic symptoms,
relative absence of deep characterological distortions
and strong secondary gains, a willingness to talk
about their difficulties, an ability to communicate
well, some skill in the social-vocational area, and
a value system relatively congruent with that of
the therapists. Such patients also tend to remain in
therapy, profit from it, and evoke the therapist's
best efforts [pp. 470-471].


Hence it should be now apparent that patients
manifest individual differences along
many outcome-relevant and likely interrelated
dimensions. A tentative listing of subject confounding
variables would include the following:
verbal intelligence, age, motivation for
therapy, level of general anxiety, level of ego-strength,
type of disorder, severity of disorder,
type of onset of disorder, socioeconomic background,
patient expectancies, verbal expressive
ability, level of occupational success,
type of value system, and likely others. In
view of this multiplicity of confounding factors,
it should be abundantly clear why permitting
patients to select themselves for research
studies has made results incomparable
and nonreplicable; why the Patient Uniformity
Assumption is indeed a myth; why
an adequate paradigm of therapy needs to
incorporate these variables into its structure
by tying together the patterning of these
factors, and by stipulating alternative treatment
procedures for patients at different levels
of these dimensions.⁴

4 As Rapaport (1960) urges: “Therapies or therapists
. . . end up by establishing their own McCarran
Act: sooner or later they announce that this or
that kind of patient is not the right kind for their
kind of therapy. Not rarely they go further and
[claim] that this or that kind of patient is ‘not
treatable.’ In the long run, psychological theories of
[therapy] must come to a point where they will make
it possible to select the therapy which is good for a
patient and not the patient who is good for a
therapy [p. 115].”


It also becomes clear why, in investigating
the effects of one’s independent variable or
variables, it is necessary to control for these
confoundings. The control problem is basically
that of equivalent groups (Underwood, 1957).
The experimental alternatives are random assignment
of subjects to treatment groups, or
prior matching of patients to form equivalent
groups. The former procedure seems to be a 
good first solution to this difficulty. By ran- 
domly assigning patients, one is statistically 
assured that the various treatment groups 
will be comparable regarding subject char- 
acteristics, will not differ on relevant subject 
variables other than in chance directions. 
However, one needs relatively large samples 
and may not be able to define clearly the 
population from which his sample came. One 
also has an error term which is larger than 
it otherwise need be, in that much of the error 
variance is identifiable, that is, can be traced
directly to the kind of relevant patient variables
listed above. Finally, one is not permitted to 
ascertain interaction effects between psy- 
chotherapy treatment (or ideally different 
types and/or amounts of psychotherapy) and 
these known-to-be relevant subject character- 
istics. 
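
The random-assignment alternative described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the patient records, the anxiety scores, and the function name `randomly_assign` are hypothetical and do not come from any study cited here.

```python
import random
from statistics import mean


def randomly_assign(patients, n_groups=2, seed=0):
    """Shuffle the patients and deal them round-robin into n_groups groups."""
    rng = random.Random(seed)
    shuffled = list(patients)
    rng.shuffle(shuffled)
    return [shuffled[i::n_groups] for i in range(n_groups)]


# Hypothetical patients: (id, anxiety score). The scores are invented.
patients = [(i, random.Random(i).randint(5, 35)) for i in range(200)]

groups = randomly_assign(patients, n_groups=2, seed=42)

# With random assignment the group means on a subject variable should
# differ only by chance; they are computed here so they can be inspected.
group_means = [mean(score for _, score in group) for group in groups]
```

Random assignment equates the groups only in expectation. The variability that remains within each group on variables such as anxiety still enters the error term, which is the inefficiency noted in the paragraph above.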

A more efficient procedure for controlling 
subject confoundings seems to be by experi- 
mental and/or statistical (partial correlation 
or covariance) matching procedures. By these 
procedures, patients in different treatment 
groups are matched on one or more relevant 
dimensions, thereby establishing group equiv- 
alency for these measures. The optimal pro- 
cedure would seem to be to introduce these 
patient variables into the research design in 
the form of independent variables, so that 
the meaningful interactions between treat- 
ments and these organismic variables can be 
ascertained. For example, different depths of 
interpretation may be more effective for pa- 
tients of low in contrast to high levels of gen- 
eral anxiety. Or, different levels of therapist 
activity or directiveness may be more benefi- 
cial for patients of low in contrast to high 
socioeconomic background. It seems that if 
the psychotherapy research paradigm is to 
begin to be filled in, this latter kind of inter- 
actional research (factorial designs) is es- 
sential. This seems to be the conclusion of 
educational research, an area having quite 
similar methodological and design problems. 
As Edwards and Cronbach (1952) observe:


Educators spent a generation on studies of the
oversimplified “Is A better than B?” type. They
sought to settle by experimentation whether large
classes were better than small, lectures better than
laboratories, frequent tests better than few. Their 
studies led to endless contradiction because, as you 
will notice, the question did not specify the organ- 
ismic variables. . . . [One investigator] showed that 
his Method A was better than B for bright, mediocre 
achievers, but that B was better for those of 
mediocre intelligence but good past achievement. 
The inclusion of both organismic variables in the 
design was essential if he was not to reach an over- 
simple, hence untrue, conclusion. . . . The writers 
agree that effort to isolate effects due to organismic 
variables can have only a beneficial effect, and that 
cases should be selected to represent as much varia- 
tion as can be. It is far more valuable to study ten 
cases, two each of five identifiable subtypes, than 
to study a pool of fifty undescribed and undiffer- 
entiated people. . . . The most promising (i.e., the 
most likely to be relevant) variable should be built 
into the design so that gains can be assessed separately 
for each variable [pp. 53-54]. 
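

The point about building organismic variables into the design can be illustrated with a small 2 x 2 factorial sketch in Python. Everything in it is assumed for illustration: the outcome scores are invented, and the labels "deep" and "shallow" (depth of interpretation) and "low_anxiety" and "high_anxiety" are hypothetical factor levels, not data from any study.

```python
from statistics import mean


def cell_means(records):
    """Return {(treatment, level): mean outcome} from (treatment, level, outcome) rows."""
    cells = {}
    for treatment, level, outcome in records:
        cells.setdefault((treatment, level), []).append(outcome)
    return {key: mean(values) for key, values in cells.items()}


def interaction_contrast(means, treatments, levels):
    """Difference between the treatment effects at the two levels of the subject variable."""
    (t1, t2), (l1, l2) = treatments, levels
    return (means[(t1, l1)] - means[(t2, l1)]) - (means[(t1, l2)] - means[(t2, l2)])


# Invented outcome scores for a 2 x 2 design (two patients per cell).
records = [
    ("deep", "low_anxiety", 8), ("deep", "low_anxiety", 6),
    ("deep", "high_anxiety", 3), ("deep", "high_anxiety", 1),
    ("shallow", "low_anxiety", 4), ("shallow", "low_anxiety", 4),
    ("shallow", "high_anxiety", 5), ("shallow", "high_anxiety", 7),
]

means = cell_means(records)
contrast = interaction_contrast(
    means, ("deep", "shallow"), ("low_anxiety", "high_anxiety")
)

# Pooled over anxiety level, the two treatments look almost alike.
pooled_deep = mean(o for t, _, o in records if t == "deep")
pooled_shallow = mean(o for t, _, o in records if t == "shallow")
```

In these invented data the pooled treatment means are nearly equal (4.5 against 5), while the interaction contrast is large. An "Is A better than B?" comparison that ignored the subject variable would therefore miss the effect entirely.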


Task Confounding Variables. Task vari- 
ables refer to dimensions or aspects of the 
experimental apparatus or stimulus, other than 
the experimenter-defined independent vari- 
able, which per se are relevant to, and in- 
ducive of, changes in the dependent variable 
measures. Task confounding comes from 
aspects of the experimental task (apparatus 
or stimulus) on which the experimenter is not 
focusing, aspects other than his arbitrarily 
defined independent variable. 

In psychotherapy research, task confound- 
ing is possible whenever one arbitrarily defines 
the independent variable. If one’s empirical 
hunch or theoretical framework implies that, 
for example, depth of interpretation is the 
crucial therapist dimension (leading to differ- 
ences in the dependent variable, e.g., level 
of patient insight), then one would like to 
conclude that in fact manipulation of therapist 
depth of interpretation effected the different 
levels of patient insight. But task confounding 
occurs and confuses the situation, if, for 
example, the therapist’s empathic understand- 
ing (rather than depth of interpretation) 
could also be responsible for the differences 
obtained. If empathy is related to insight, 
and if it is not controlled in the above situa- 
tion, interpretation of the insight differences 
obtained will be ambiguous. 

Recent psychotherapy research has in- 
dicated that a number of therapist variables 
may be related to patient improvement. A 
tentative listing would include: therapist experience
level, prestige, occupational interest
pattern, enthusiasm or confidence, verbal reinforcement,
therapist expectancies, therapist
“Conditions,” depth of interpretation, liking
for the patient, degree of ambiguity, deconditioning,
therapist orientation, therapist personality,
and likely others. With this extensive
list of therapist or task confounding variables,
it becomes essential that theoretical paradigms
attempt to incorporate these dimensions,
as well as explain their interrelationships.

Also, in light of the above, it should be
abundantly clear that mere arbitrary definition
of one aspect of the therapist as one’s
independent variable does not excuse an investigator
from considering other therapist
factors that may be concomitantly operating
and of themselves producing the results obtained.
One needs to tackle the arduous task
of attempting to measure and experimentally
or statistically control these other factors in
order to eliminate task confounding for his
particular study. Or, one can incorporate these
factors as additional dimensions (additional
independent variables) in his design so that
possible interactions between these various
therapist dimensions and patient change can
be determined. Quite likely the crucial therapist
communications are multidimensional,
and researchers need to be acutely attuned
to the possible covariance and/or interaction
of other therapist dimensions with the particular
one in which they may be interested.

Environmental Confounding Variables. Environmental
variables refer to all nontask
(nonapparatus) variables which change concurrently
with manipulation of the independent
variable, and which per se are relevant to,
and inducive of, changes in the dependent
variable measures. Examples of this kind of
confounding in psychological research would
be the influence the examiner has, independent
of the ink blots, in Rorschach research;
or the effect humidity has on GSR responses
independent of other manipulations; or time
of day, in contrast to type of instruction, in
educational research.

The basic reason for having a control group
in traditional evaluative studies of therapy is
to hold constant this possible contamination
from environmental sources. This especially
needs to be done for psychotherapy research
since time of itself is an important variable
vis-à-vis patient changes; more exactly,
extra-therapy events that may be crucially
relevant to patient changes can occur over
time. The death of a close friend or relative
may occur, causing a relapse in the patient.
He may meet a new friend who may effect an
important difference in the patient’s perception
of himself and his behavior. His job situation
may change: he could be promoted, find
the opening he has been waiting for; or on the
other hand, he may be suddenly fired or laid
off. Obviously, any number of possible interpersonal
events may occur which can be
directly relevant to the patient’s psychopathological
condition, with the result that changes
apparent in a patient may have little to do
with the psychotherapy he is receiving concomitantly.
Hence, a control group is needed
which receives no therapy, as a base line of
improvement from environmental events to
which the condition of therapy can be compared.

Further environmental contamination can
come from the fact that some of the patients
may be receiving concomitant medication
which can be responsible for patient behavior
changes. Other environmental variables not
often mentioned would be differences in the
office situation in which therapy occurs (e.g.,
degree of soundproofing; comfort of chairs;
color of room; pictures on the wall or not,
and which kinds; seating arrangement of
therapist and patient; warmth of receptionist;
privacy of reception room and office;
etc.). These variables are likely not extensively
operative in the therapy interaction,
but to the extent that they are they must be
controlled or varied systematically.

[. . .] since one can seldom have control
over patients’ environments (although
this represents the real potential of
[. . .] studies) a control group for environmental
events is essential before one can
be confident in concluding that either a particular
therapy treatment or several treatments
are more effective than no treatment.

Conclusion 


This final section has attempted to spell 
out in detail the classes of variables that need 
to be incorporated into theoretical formula- 
tions and/or controlled for in psychotherapy 
research before replicable findings can result. 
The first crucial implication of these paradigm 
considerations is that theoretical formulations 
can no longer ignore the various domains of 
variables shown to be relevant to psycho-
therapy outcome.⁵ The second, and equally
vital, implication is that psychotherapy re- 
search can no longer ignore the necessity for 
factorial designs. Although this latter point 
has been emphasized in the past, few re- 
searchers seem to incorporate the recom- 
mendation. As recently as 1959 Berdie ob- 
served: 


When this author reviewed current research in 
counseling nine years ago, he concluded, an area of 
research importance to every counselor, but in 
which this reviewer could find no research publica- 
tions, concerns the relationship between diagnosis 
and therapy. What diagnostic categories and tech- 
niques are most useful in selecting appropriate thera- 
pists? Near the conclusion of this nine-year period 
one group of authors reported the first substantial 
study of therapy that took into account the effects 
of precounseling attributes, differential counseling 
methods, and outcomes (Ashby et al., 1957). Al- 
though the results of this study were rather meager, 
certainly this study itself should be considered a 
pioneer one [p. 345]. 


5 For a detailed discussion of a variable model
similar to the one presented here, see Levinson
(1962), who presents a list of seven domains of
therapy variables: (a) relatively stable personal-
social characteristics of the therapist, (b) relatively
stable personal-social characteristics of the patient,
(c) characteristics of the patient-therapist pair, (d)
stages in the treatment career, (e) overall treatment
outcome, (f) the institutional setting of treatment,
and (g) the social context of the patient’s life. Levinson
concludes: “If we are to forego theoretical unity,
we must at least have a common framework on
which to hang our differences. . . . To the extent
that [this framework] in fact takes into account
the events and variables dealt with in the diverse
studies, it may serve to sharpen our view of the
ways in which these studies overlap, converge, complement
each other, and stand in direct opposition
[p. 14].”


In short, the research message seems clear:
Current paradigms are inadequate. The time
is long overdue not only to acknowledge, but
also to meet head-on in theoretical formulations
and research investigations the minimal
requirements of a psychotherapy research
paradigm.


Now that the mystery and aura are being removed, 
the psychotherapy relationship is being seen by 
varieties of researchers as a phenomenon which is as 
fruitful for investigation as are the parent-child, 
peer-peer, teacher-student, experimenter-subject, and 
other important human groups. Also, importantly, it 
is being seen in its true perspective: i.e., as no more 
important or mysterious than any of these other re- 
lationships. [Matarazzo, 1965, p. 219] 
SEX AND ANXIETY DIFFERENCES IN
EYELID CONDITIONING ¹


KENNETH W. SPENCE AND JANET T. SPENCE


University of Texas 


Data concerning the relationship between eyelid-conditioning performance and 
2 subject (S) variables, Ss’ sex and scores on the Manifest Anxiety Scale, 
are examined. In studies employing standard procedures, high-anxiety Ss 
were superior to low-anxiety Ss in 23 of 27 comparisons, and females superior 
to males in 18 of 19 comparisons. In contrast, the direction of the differences 
was split approximately equally between high- and low-anxiety Ss and between 
males and females in studies in which conditioning was presented within the 
context of a masking probability-learning task. 


In the present report, evidence bearing on
the relationship between eyelid conditioning
and two subject variables will be examined,
namely, the sex of the subject and anxiety
(drive) level as inferred from scores on the
Manifest Anxiety Scale (Taylor, 1953). A
comparison will be made for each of these
variables of the data obtained from studies
using traditional conditioning procedures and
those obtained from studies employing a technique
in which the conditioning procedures are
masked or disguised by presenting them in the
context of a probability-learning task (Spence,
1963, 1966). In this latter type of situation,
the experiment is described to the subject
(S) as being one in which the experimenter
(E) is studying the effects of irrelevant, distracting
stimuli (a paired tone and puff of air
to the corner of the eye) on Ss’ capacity to
learn a problem task that, in studies cited,
involved predicting which of two lamps
would light over a series of trials.


1 This paper is part of a project concerned with
the influence of motivation on performance in conditioning
and learning. Preparation for publication
was supported by Contract Nonr 375 (18) between
the University of Texas and the Office of Naval
Research (K. W. Spence), and by the Hogg Foundation
for Mental Health, University of Texas (Janet
T. Spence).


PERFORMANCE AS RELATED TO ANXIETY
(MAS LEVEL)
Standard Conditioning Situation

In a recent article, one of the present writ- 
ers (K. W. Spence, 1964) reviewed the avail- 
able evidence concerning the relationship be- 
tween conditioning performance and Ss’ scores 
on the Manifest Anxiety Scale (MAS). In 21 
of 25 independent comparisons, all of which 
were drawn from studies employing standard 
conditioning procedures, the performance of 
high-anxiety (HA) Ss was found to be su- 
perior to that of low-anxiety (LA) Ss. Two 
additional studies (Beck, 1963; Ominsky & 
Kimble, 1966) have since reported positive 
results, each demonstrating a significant per- 
formance difference between high- and low-
anxiety groups in favor of the former.

Thus in 23 out of 27 studies conducted in 
the standard conditioning situation, HA Ss 
have performed at a higher level than LA Ss. 
The probability of obtaining such a percent- 
age of differences (85.2%) in the same direc- 
tion by chance is less than .01. Moreover, all 
of the 15 differences that were significant 
were in the direction of higher conditioning 
performance on the part of the HA Ss, where- 
as none of the four differences in favor of the 


LA Ss were significant. 
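
The chance probability quoted above can be checked with an exact binomial (sign) test. The text does not say which test the authors used, so the sketch below is an assumption: it treats each comparison as a fair coin toss under the null hypothesis, and the function name `sign_test_p` is introduced here only for illustration.

```python
from math import comb


def sign_test_p(successes, n, two_sided=True):
    """Exact probability of at least `successes` same-direction results in n fair trials.

    The two-sided value doubles the upper tail, which is appropriate when
    `successes` exceeds n / 2, as it does in both comparisons below.
    """
    tail = sum(comb(n, k) for k in range(successes, n + 1)) / 2 ** n
    return min(1.0, 2 * tail) if two_sided else tail


# 23 of 27 comparisons favoring high-anxiety Ss (standard situation).
p_anxiety = sign_test_p(23, 27)

# 18 of 19 comparisons favoring female Ss (standard situation).
p_sex = sign_test_p(18, 19)
```

Under this assumption the two-sided probabilities are roughly .0003 for the anxiety comparisons and .00008 for the sex comparisons, consistent with the bounds of .01 and .001 reported for the two variables.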
TABLE 1

COMPARISON OF PERFORMANCE OF HA AND LA Ss IN MASKED CONDITIONING SITUATION

                                       Percentage of CRs    Difference
Study                            N       HA       LA          (H-L)
1. Spence & Rutledge (1964)      17     48.4     50.6         -2.2
2. Homzie (1964)                 48     71.6     73.1         -1.5
3. Spence (1966)                 48     68.0     65.2          2.8
4. Spence (Unpublished)          39     72.3     67.9          4.3
5. Spence (Unpublished)                 63.0     51.7         11.3
6. Spence (Unpublished)                 37.0     46.8         -9.8

Note. None of the above differences is statistically significant.


Masked Conditioning Situation


The results obtained from a number of re-
cent studies that employed the masking pro-
cedure are presented in Table 1. None of these
studies was specifically concerned with the
relation between anxiety and performance,
but as the scores on the MAS were available,
it was possible to compare the conditioning
performance of Ss who scored in the upper
and lower quartiles of the distribution of
scores on the scale. The data presented for
each experiment are the mean percentages of
CRs given in the last 20 trials of the acquisi-
tion period. The means were compared sta-
tistically by analyses of variance which in-
cluded terms for all experimental variables
(e.g., intensity of UCS) being manipulated in
a given study, as well as S’s anxiety level.

The first column of the table gives the ref-
erenced study, if published. The number of
Ss is shown in the second column, while the
mean percentage of CRs for the two groups,
HA and LA, and the difference between them,
are given in the following three columns. The
P values of the Fs for the main effect of anxi-
ety level are noted.

As may be seen, the results obtained with
the masking situation are in sharp contrast
with the findings obtained in the standard
conditioning situation. Not only is none of
the differences significant, but in half of the
six experiments the HA Ss responded at a
lower level than the LA Ss.


PERFORMANCE AS RELATED TO SEX
Standard Conditioning Situation


Although studies of the effects of particu-
lar experimental variables on conditioning
performance have frequently attempted to
control for a possible sex difference, comparisons
of the performance of men and women
have seldom been reported in the literature.
In their early article on individual differences,
Campbell and Hilgard (1936) noted the fact
that their female Ss responded at a somewhat
higher level than did the male Ss, but the difference
was not significant. After finding that
their female Ss just failed to perform significantly
higher than men at the .05 level, Spence
and Farber (1953) cited evidence from their
study and from an earlier investigation
(Spence & Taylor, 1951) which indicated that
in six comparisons involving extreme groups
on the MAS, the female Ss responded at a
higher frequency level than the male Ss.

Table 2 presents the results of an analysis
of all the data available from published and
unpublished studies conducted under the
standard procedures in the senior author’s
laboratory. The information is similar to that
of Table 1 except that the comparison is between
male and female Ss rather than HA
and LA Ss. The data from each study, the
mean percentages of CRs that occurred in the
last half of the conditioning trials, were again
compared by means of analysis of variance
and the p values of the obtained Fs are given.
Examination of the table reveals first that 18
of the 19 comparisons found the women responding
at a higher level than the men. If
there were no relation between sex and conditioning
performance, the probability of obtaining
such a percentage of differences
(95%) in the same direction by chance is less
than .001. Looking next at the p values, it may
be seen that slightly less than one-third of the
studies were significant, five at the .05 level,
TABLE 2

COMPARISON OF PERFORMANCE OF MALE AND FEMALE Ss IN STANDARD CONDITIONING SITUATION

                                                     Percent CRs
Study                                          N    Female   Male    Difference
 1. Spence & Taylor (1951)                    100    61.5    49.1     12.4**
 2. Spence & Farber (1953)                     6i    56.8    44.0     12.8
 3. McAllister (1953)                          45    64.2    46.3     17.9*
 4. Taylor, E. (1954)                         100    a3      55.5     8s*
 5. (1958)                                     72    62.8    50.5     12.3*
 6. Spence, Haggard, & Ross, Exp. I (1958)    100    47.4    46.6       .8
 7. Spence, Haggard, & Ross, Exp. II (1958)    80    40.3    30.3     10.0
 8. Runquist & Ross (1958)                     60    56.3    51.6      4.7
 9. Runquist & Ross (1959)                     90    53.0    40.9     12.1
10. Runquist & Spence (1959b)                  88    47.5    45.3      2.2
11. Runquist & Spence, Exp. I (1959a)          80    64.2    46.0     18.2**
12. Runquist & Spence, Exp. II (1959)          80    47.9    3.6      12.4*
13. Spence & Ross (1959)                       49    72.1    63.4      8.7
14. Ross & Hunter (1959)                       66    54.8    38.4     15.6**
15. Ross (1959)                                40    66.6    66.2       .4
16. Ross & Spence (1960)                       30    45.2    16.9     28.3*
17. Spence & Weyant (1960)                     72    55.3    57.3     -2.0
18. Spence, Homzie, & Rutledge (1964)           4    67.0    65.1      1.9
19. Spence (Unpublished)                      120    50.1    45.2      4.9



and one at the .01 level. Two more were significant
at the .10 level. Thus a significant
difference failed to appear in 11 studies despite
the fact that the number of Ss in some
was quite large. Taken altogether, the evidence
suggests that while performance does
vary significantly with sex, a very small portion
of the intersubject variance in conditioning
performance is accounted for by this factor.
However, studies investigating the effects
of experimental variables had best take
account of it in designing experiments and
analyzing data from them.


Masked Conditioning Situation


Table 3 presents the findings relevant to
sex differences obtained in recent experiments
in which the conditioning was masked by the
probability-learning task. In spite of some
reasonably large samples, not one of the differences
in percentage of CRs made by the
two sexes is statistically significant. Moreover,
quite in contrast to the highly consistent
pattern of superiority on the part of
females in the standard situation, almost half
(4 out of 9) of the comparisons in the masking
situation showed the males responding at
a higher level than the female Ss. Indeed, the
three largest differences were in favor of the
men. This reversal is thus seen to closely
parallel the change in the case of the anxiety
variable; in both instances a factor which had
been shown to be significant under one set of
conditions failed to be related to conditioning
performance in the second.


TABLE 3

COMPARISON OF PERFORMANCE OF MALE AND FEMALE Ss IN MASKED CONDITIONING SITUATION

                                               Percent CRs     Difference
Study                                    N    Female   Male      (F-M)
1. Goldstein (1962)                     100    55.0    51.4       3.6
2. [. . .] Rutledge (1963)               54    45.5    45.6      -0.1
3. [. . .]                               25    49.3    60.9     -11.6
   Spence, Homzie, & Rutledge (1964)
   Spence & Rutledge (1964)
   Homzie & Weiss (1965)
   Spence (Unpublished)
   Spence (Unpublished)

Note. None of the above differences is statistically significant.


Discussion 


The difference in conditioning performance 
of the HA and LA Ss obtained in our earlier 
studies was interpreted in terms of our theo- 
retical construct of generalized drive (D) 
(Spence, 1956). According to this formula- 
tion, the performance difference reflects a 
difference in level of D which in turn was as- 
sumed to be the result of a difference in the 
level of emotional reactivity of the two groups 
to the experimental situation and procedures. 
Guided by this theory, our experimental situa- 
tion and procedures have been deliberately 
arranged to produce some degree of appre- 
hensiveness on the part of S. To this end, S 
has been seated in a dental chair, isolated in 
a strange, darkly illuminated room, and given 
a minimum of information as to what be- 
havior was expected of him. Furthermore, 
since it was evident that students are much 
more likely to be concerned and apprehensive 
in their first psychological experiment, our 
studies directly concerned with the anxiety 
variable employed only individuals with no 
previous experience as an S. Under these con- 
ditions of maximizing the likelihood of differ- 
ential emotional responsiveness, studies from 
the Iowa laboratory consistently found sig- 
nificant performance differences between HA 
and LA Ss (Spence, 1964). 

Some confirmation of the relevancy of these 
factors is revealed by the most recent report 
from the Duke laboratory (Ominsky & Kim- 
ble, 1966). Employing physical conditions 
that, on the surface at least, appeared to be 
much less threatening than ours, Kimble and 
his associates (King, Kimble, Gorman, & 
King, 1961) had earlier failed to obtain su- 
perior performance on the part of HA Ss in 
three experiments. Experimenting in a new
laboratory that more closely approximated the
features of ours (dental chair, isolated room,
etc.) however, Ominsky and Kimble obtained
positive results (p < .025). In line with the
implications of our theoretical interpretation,
they suggested that their new conditions and
procedures succeeded in arousing differential
anxiety in their Ss and thus produced the finding
of a positive relationship between anxiety
and eyelid-conditioning performance.

This brings us to the problem of the authors’
more recent negative findings obtained in
the context of the masking task. How is the
obvious interaction between anxiety level
and the standard and masking type situations
to be explained? One can rule out physical
conditions which were more or less the same.
A factor that might have played some role is
that the Ss run in recent investigations with
the probability-learning task have not all been
naive, that is, without prior experimental experience,
as was the case in our earlier studies
directly concerned with the effects of anxiety
on conditioning. In the previous review of [. . .]
are suggestive and further research should be
directed at the study of this factor.
Any further discussion of this phenomenon 
must necessarily be quite speculative, for comparative 
data on performance in the two situations 
(standard and masking, in which all 
other variables but anxiety have been controlled) 
do not exist. It is not even known, 
for example, whether the elimination of the 
performance difference represents a relatively 
greater lowering of the HA Ss’ performance or 
is the result of a relatively greater increase in 
the performance level of the LA Ss. The former 
finding would imply that the masking 
procedure results in less emotionality being 
aroused than the standard arrangement, the 
latter that the emotionality is increased or 
that some other source of increased D is 
introduced. 

Obviously our theory is not adequate or 
complete enough to predict what effects the 
introduction of the masking task would have 
on the general D of the S. There are so many 
alternative hypotheses that we prefer to wait 
until there is more clear-cut evidence as to 
what the disappearance of the performance 
differential implies, that is, increased or decreased 
D. Such knowledge will at least serve 
as a constraint on the amount of speculation. 

Finally, with regard to the question of sex 
differences in conditioning performance, the 
evidence presented strongly suggests that 
women perform at a higher level than men in 
the standard conditioning situation. The dif- 
ference is not quite as large or stable as in 
the case of the anxiety variable, but its con- 
sistency of occurrence is nevertheless sta- 
tistically significant. The interpretation that 
Spence and Farber (1953) have offered of this 
difference is essentially the same as that in 
the case of Ss who score at the extremes of the 
anxiety scale. They have suggested that the 
conditioning situation arouses greater fear or 
emotionality in women than in men and hence 
women have a higher D during the course of 
the experiment. This assumption of a com- 
mon theoretical basis for these two sets of 
differences receives further encouragement 
from the fact that the introduction of the 
masking conditions had the same effect on 
them, namely the elimination of the performance 
difference. 


A REVIEW AND INTERPRETATION OF SOME ASPECTS 
OF THE INFANT-MOTHER RELATIONSHIP IN 
MAMMALS AND BIRDS 


DONALD LEE KING 1 


Stanford University 


Although important research in the area of the infant-mother relationship has 
been increasing, research appears to be guided by an empirical orientation—a 
comprehensive theory is lacking. In this paper the literature on infant-mother 
relationships in both mammals and birds is reviewed and given a theoretical 
interpretation. 2 hypotheses are proposed. They are that in all mammals and 
birds: (a) novel stimuli elicit fear responses in both young and adult 
organisms, and (b) certain stimuli elicit a pleasant feeling in the offspring and 
lead to the permanent reduction of fear of novel stimuli, and that these stimuli 
are always found in the person of the natural mother. The nature of a basic 
survival problem that all mammals and birds face is indicated, and it is 
demonstrated how the 2 above hypothesized phenomena constitute a very 
efficient solution of the problem. 


Although parallels in the nature of the 
infant-mother relationship in mammals and in 
birds have sometimes been pointed out, the 
basic similarities have not been sufficiently 
appreciated. The similarities to be discussed 
include the fear response of the mammalian 
and avian organism to novel stimuli and the 
role that certain stimuli of the maternal 
object play in reducing the fear. 


QUALITIES AND PARAMETERS OF THE 
MATERNAL STIMULUS WHICH 
ATTRACT THE INFANT TO 
THE MOTHER 


Rhesus monkeys and dogs are attracted to 
surrogate mothers that provide soft tactual 
stimulation, but they are much less attracted 
to wire surrogate mothers (Harlow & Zimmerman, 
1959; Igel & Calvin, 1960). There 
are suggestions in the literature that infant 
sheep, goats, elk, and moose are attracted 
to the sight of moving objects (Altmann, 
1963; Blauvelt, Richmond, & Moore, 1960; 
Hess, 1958). 

1 The author was supported by United States 
Public Health Service Predoctoral Fellowship 5 F1 
16,799-02 while writing this paper. 

More attention has been paid to the stimulus 
parameters that attract the infant bird 
to the surrogate mother. It has frequently 
been found that a wide range of different 
types of stimuli can be imprinted to by the 
same species (Moltz, 1960). Birds have been 
found to imprint to a flashing light (James, 
1959) and rhythmic sounds (Fabricius & 
Boyd, 1952-53). Moltz (1963) suggests 
that there is a common parameter to all the 
stimuli that have been found to lead to im- 
printing; he believes that imprinting occurs 
to low-intensity stimuli. As experimental sup- 
port for this belief, he restricted Peking 
ducklings so that they could see but not 
follow an imprinting object and found that 
imprinting occurs more strongly to a retreat- 
ing object than to an approaching one. He 
points out that the proximal retinal intensity 
of the retreating object is decreasing and 
suggests that it is the decreasing intens- 
ity which is responsible for the stronger 
imprinting. 

There are several problems with Moltz’s 
theory: (a) He never makes clear whether 
the critical parameter of the imprinting stimu- 
lus is decreasing intensity, low intensity, or 
both. In his experiment, the retreating object 
has a decreasing proximal intensity, but its 
average proximal intensity is equal to that of 
the approaching object. But in spite of this, 
Moltz slips into talking of the critical pa- 
rameter as a low-intensity one. He is forced 
to do this because in the James and Fabricius 
and Boyd studies mentioned above the proximal 
intensity of the imprinting object increases 
if the bird approaches. (b) Moltz 
does not consider the possibility that although 
imprinting may occur to such low-intensity 
stimuli as lights and rhythmic sounds, it may 
occur much more strongly to a more standard 
imprinting stimulus such as a retreating ob- 
ject. (c) Some species (mallards and birds 
of the Limicolae) imprint to auditory stimuli 
of low intensity, but not to low-intensity 
visual stimuli (Lorenz, 1937). Also, mallard 
ducklings approach sounds of equal intensity 
but of unequal pitches to different extents 
(Collias & Collias, 1956). 

It is reasonable to suppose that the nature 
of the stimuli that attract the infant is a 
function of the environment that the infant is 
exposed to shortly after birth. An infant bird 
born in an environment where members of 
the same species other than the mother fre- 
quently walk by the nest will most likely 
imprint to a more specific stimulus than one 
who has only its mother to follow. Likewise, 
an infant bird born in a very dimly lit en- 
vironment might be expected to imprint to 
auditory stimuli more readily than visual 
stimuli. Consider a mammalian example: it 
is difficult to conceive of an infant whale 
being attracted to a soft, dry, furry stimulus 
precisely because such stimuli are normally 
never found in its immediate vicinity. 

It is clear that different stimuli vary in the 
effectiveness with which they attract the in- 
fant animal to them. The range of stimuli 
that attract the infant is great for many avian 
species, but it is not totally inclusive. 
The type of stimuli that attract mammalian 
species appears to be more specific. In dis- 
cussions of the type of stimuli that attract 
the infant, it has been suggested that the 
types of stimuli leading to attraction will 
depend on environmental variables, and will 
not be identical for all mammalian or all 
avian species. 

It should be noted that this paper is concerned 
with the attraction and following 
aspects of imprinting, and when the word 
imprinting appears it is used in this context. 
The author’s presentation of imprinting is 
appropriate, even though he is not concerned 
with other forms such as sexual imprinting 
or with the “imprinting” of song in birds. 


THE PROCESS UNDERLYING THE INFANT’S 
ATTRACTION TO THE MOTHER 


Comparative psychologists and ethologists 
have gathered strong evidence that external 
stimuli are capable of inducing (equal in 
meaning to eliciting) responses in birds and 
mammals that are not simple reflexes. Inducing 
stimuli are not what are generally considered 
to be positive and negative reinforcers 
(e.g., food, water, and noxious stimuli such 
as shock and fire), and they are not required 
to be paired with reinforcers to be effective. 
In addition, their effectiveness is not conditional 
on prior experience with them. Melzack, 
Penick, and Beckett (1959) found, for example, 
that mallards with no previous experience 
with objects flying overhead exhibited 
fear responses to models of both a goose 
and hawk flying overhead in the first several 
test sessions. Certain stimuli also induce 
approach and specific hormonal responses 
in mammals and birds (Lehrman, 1961; 
Schneirla, 1959). Imprinting and the cloth 
mother’s ability to attract a baby monkey 
are, of course, excellent examples of the inducement 
of an approach response. The suggestion 
made here is that in all instances of 
the infant’s approach to the mother or 
mother-surrogate already considered and in 
all yet undocumented cases of the infant’s 
attraction to the maternal stimulus the approach 
is mediated by the experiencing of 
a pleasant feeling induced by the mother or 
mother-surrogate. Experimental support for 
the hypothesis comes from experiments by 
Harlow (1962b), Kovach and Hess (1963), 
and Mason and Berkson (1962). 

Harlow devised surrogate mothers that 
could, upon the experimenter’s wish, cause 
pain to the infant rhesus monkey resting on 
them. One mother was designed so that compressed 
air could be forced out through its 
ventral surface, but when the infant was 
exposed to this noxious stimulation, it “clung 
more and more tightly to the unworthy 
mother.” Young monkeys either continued to 
hold onto or returned to two other cloth 
mothers that inflicted pain. They held onto 
the mothers as strongly or more strongly than 
before the noxious stimulation was applied. 
Contemporary learning theory predicts that 
the infant should avoid the mother (the 
mother is directly associated with pain), but 
if we regard the mother (or a component of 
her) as a stimulus that produces a pleasant 
feeling, then the results are more understandable. 
In the face of pain, the infant will be 
more motivated than ever to experience a 
pleasant feeling. This is why, instead of 
avoiding the mother, he holds on to her “more 
and more tightly.” 


In an experiment that has direct parallels 
with Harlow’s, Kovach and Hess (1963) 
shocked Vantress Broiler chicks in an imprinting 
situation 18 hours after they were 
hatched. The chicks were exposed to a stationary 
model for the first 10 minutes, and 
then, as the model moved intermittently 
about the runway, the chicks were shocked 
by electrodes attached to the wings a 
total of 11 times. The procedure for the control 
was identical, except that the shock 
was omitted. The experiment is similar to 
Harlow’s in that noxious stimulation is associated 
with the presence of the surrogate 
mother in the experimental group. Kovach 
and Hess found that the shocked group fol- 
lowed the imprinting model significantly more 
than the control. The interpretation is that 
because of their greater emotional unrest, the 
shocked group was more motivated to per- 
ceive the stimulus characteristics of the sur- 
rogate mother that induced a pleasant feeling. 
Using the same age group and the same in- 
tensity, duration, and number of shocks, the 
above results were replicated. 

The experience of a pleasant feeling and 
the experience of emotional unrest are viewed 
as bipolar extremes on a pleasant-unpleasant 
dimension. If emotional unrest is experienced, 
then a pleasant feeling is not, and vice versa. 
It follows, then, that if two animals are 
placed in an emotionally upsetting situation, 
the one permitted to perceive stimuli which 
elicit a pleasant feeling at the same time 
should exhibit less distressed emotional behavior. 
Mason and Berkson (1962) have 
performed such an experiment. They found 
that the percentage of vocalizations to shock 
was significantly lower for chimps held with 
their ventral surface pressed against the 
experimenter’s chest than for chimps placed on 
a bare surface. Since chimps normally hold 
onto their mother’s ventral surface, it is likely 
that the stimuli associated with being held 
against the experimenter’s chest produced a 
pleasant feeling which operated against the 
effect of the shock. 

In the previous section of this article, 
it was concluded that only a certain more 
or less exclusive set of stimuli lead to the 
young being attracted to the mother. The 
suggestions made in this section are that: 
(a) the young’s attraction to the mother is 
mediated by the elicitation of a pleasant feel- 
ing by the more or less exclusive set of 
stimuli, (b) in no instance is the presence of 
what are normally called reinforcing stimuli 
necessary to the development of the off- 
spring’s attraction to the mother, (c) a pleas- 
ant feeling will reduce an unpleasant feeling 
such as fear or pain. (It may be that a 
pleasure derived from eating, copulating, and 
the like will act to reduce an unpleasant feel- 
ing, but with much less effect than the stimuli 
considered here.) 

The use of an intervening variable to ac- 
count for what can be operationally defined 
as the infant’s approach to a stimulus was 
made because the same stimulus can reduce 
emotional distress (as demonstrated in the 
Mason and Berkson study, and inferred from 
the Harlow, and Kovach and Hess studies). 
Operationally, the phenomena of approach 
and the reduction of emotional distress are 
totally independent; theoretically (by the 
use of an intervening variable) they are not. 
Additional examples of the mother’s ability 
to reduce an unpleasant feeling are given in 
the “Reduction of Fear” section. 


NATURE OF FEAR 


In order to understand the psychological 
significance of the mother in the development 
of the young, it is necessary to be fully 
aware of the wide prevalence of one of the 
causes of fear; that is, the fear that follows 
upon the perception of certain kinds of ex- 
ternal stimuli that are not painful and that 
have not been associated with painful stimuli 
in the past. 

The main theoretical treatment of fear 
produced by external stimuli of the above 
specification has been made by Hebb (1946). 
He hypothesized that a novel sensory event 
cannot be integrated into an organized phase 
sequence without initial disruption. A novel 
sensory event has no other place to go, how- 
ever, and thus will always tend toward inte- 
gration with the already established phase 
sequences. In the process of integration, a 
disorganization of phase sequences must 
initially occur, and fear is a product of the 
disorganization. 

If the sensory input to an organism is 
greatly restricted from birth, little by way 
of established phase sequences that are de- 
pendent on sensory input would exist. If novel 
sensory events were to be experienced, no 
attempt could be made to integrate them into 
existing phase sequences because little or no 
phase sequences would exist. As a result, no 
disruption would take place. Hebb predicts, 
therefore, that for the animal with a restricted 
sensory input from birth, the perception of 
novel sensory events would not lead to a fear 
response. This prediction is not true. The 
facts are that animals raised under restricted 
conditions from shortly after birth and ex- 
posed to a strange environment in adulthood 
are extremely fearful, much more so than 
normally-raised infants (Harlow & Harlow, 
1962; Katz, 1953; Mason & Green, 1962; 
Menzel, Davenport, & Rogers, 1963a, 1963b). 

Menzel, Davenport, and Rogers (1963a) 
separated chimpanzees from their mothers 
within the first day of life and placed them 
in small cubicles that were impermeable to 
outside light and cleaned only once a week. 
There were four subgroups of mother- 
separated animals: a social-added group (two 
cubicles placed together, subjects, Ss, sepa- 
rated only by a set of bars), a manipulation- 
added group (Ss given the Opportunity to 
manipulate a lever and switches), a visual- 
added group (designs and pictures shown on 
the wall by a slide projector), and a maxim- 
ally-restricted group (bare walls of the cubi- 
cle). It can be seen that the variety of sen- 
sory stimulation ranged from moderate to 
highly restricted. At about 22 months, Ss 
were placed in the testing apparatus which 
consisted of a standard rearing cubicle with 
a place for visual stimuli and objects to be 
presented. As compared to the controls, all 
four of the restricted groups showed “incessant 
stereotypy ... and timidity toward 
novel objects.” 

In a second experiment, the same Ss were 
put in an enclosed room that contained nine 
objects (Menzel, Davenport, & Rogers, 
1963b). The restricted Ss exhibited “stereotyped 
responses, preferences for certain bodily 
positions (especially prone or supine) ... 
[and] initial avoidance of most objects and 
prolonged distress in most novel situations.” 
As in the case of the first experiment, the 
differences among the four mother-separated 
groups were slight or nonexistent, while the 
differences between the mother-separated and 
feral controls were great. 

Mason and Green (1962) separated infant 
rhesus monkeys from their mothers within 12 
hours after birth, and placed the infants in 
individual cages where they were permitted 
to see and hear but not to contact other 
monkeys. At a mean age of 17.9 months, 
they were placed alone in a 12 by 14 foot 
room which was empty except for four objects 
hung on the wall. Mason and Green write 
that the “restricted monkeys crouched, sucked 
thumbs or toes, clasped themselves, and en- 
gaged in rocking or other stereotyped repeti- 
tive behaviors. None of these responses were 
observed in the feral group.” Using the same 
species, Harlow and Harlow (1962) separated 
infants from their mothers and placed 
them in individual cubicles with solid walls. 
After either 6 months or 2 years of isolation, 
they were placed in a social situation. They 
were extremely fearful and typically exhibited 
marked stereotyped behavior. Upon 
the approach of another animal, they would 
crouch or flee. 

Developing in a fairly complex and changing 
environment apparently does not lead to an 
adaptive response to novel stimuli. Both the 
Mason and Green restricted group and the 
Menzel, Davenport, and Rogers social-added 
group exhibited extreme fear upon exposure 
to a novel environment even though the en- 
vironments in which they were reared were 
moderately rich in stimulation. Both groups 
were permitted to see and hear others of their 
own species who, no doubt, exhibited many 
variations in posture, movement, and vocalization. 
The maximally-restricted groups of the 
Menzel et al. studies and the Harlow and 
Harlow experiment demonstrate, on the other 
hand, that fear results even with stringent 
isolation, contradicting Hebb’s hypothesis. 
Considering all the early-isolation studies to- 
gether, they indicate that the amount of fear 
is not a function of the variety of sensory 
stimuli the infant is exposed to after birth 
(given that the infant is not allowed to per- 
ceive stimuli which induce a pleasant feeling). 
The experiments do, however, support the 
position that novel stimuli (be it a strange 
room, a novel object, or a moving animal) 
elicit fear. 

An experiment with an avian species has 
shown the same results. Bruckner, according 
to Katz (1953), separated chicks from the 
mother and housed them individually. When 
the isolated chicks were brought into contact 
with peers of their same age, they were 
“overcome by fear, bewildered, and hopeless 
[and] they ran up and down the opposite 
wall in a search of an outlet.” 

The fear-inducing capacity of novel stimuli 
is also expected to be observable in animals 
that are not separated from their mother. 
Bayley (1932), for example, analyzed the 
different causes of crying in children brought 
in for monthly psychological examinations. 
One cause was considered to be the strange- 
ness of the place and of the people doing the 
psychological testing. Arsenian (1943) observed 
that children (mean age 16.1 months) 
placed in a strange but otherwise pleasant 
room that contained toys and pictures cried 
and exhibited autistic behavior. The 6- and 
7-month-old child’s fear of strangers has been 
mentioned by many authors. Rhesus monkeys 
raised on cloth or wire mothers exhibit fear 
both of an unfamiliar object when it is introduced 
into the home cage and of a strange 
room with several objects spread around it 
(Harlow & Zimmerman, 1959). Scott (1963) 
indicates that dogs whine when placed in a 
novel environment. Freedman, King, and 
Elliot (1961) observed that pups raised by 
their mothers and away from humans are, 
starting at the age of 5 weeks, fearful of a 
quietly sitting human. A strong fear of 
sounds and sights leading to violent attempts 
to flee has been observed in young mice 
(Williams & Scott, 1954). 

Observations that novel stimuli also elicit 
fear in young birds have been made. Canvasback 
and Redhead ducklings emitted continuous 
distress calls when placed alone in a 
strange environment (Collias & Collias, 
1956) and 78% of chicks initially exposed 
to an imprinting object at 54-60 hours after 
hatching exhibited fear responses (Jaynes, 
1957). In addition, Moltz (1960) writes 
that: 


Beginning at approximately 25 or 30 hours from 
the occurrence of hatching, precocial birds will fre- 
quently exhibit “anxiety” or “fear” in response to 
unfamiliar aspects of the environment. Ramsay and 
Hess (1954), for example, note that “fear responses” 
were a characteristic feature of mallard behavior 
and Fabricius [1951] reported the same phenomenon 
in tufted ducks and eiders. Hinde, Thorpe, and Vince 
(1956) speak of the fleeing response in moorhens 
and Jaynes [1958] found the same behavior in the 
domestic chick. The writer has also observed that 
the Peking duck responds in a similar manner to a 
wide variety of unfamiliar stimuli [pp. 300-301]. 


Stimuli that elicit fear must be described 
in terms of novelty. Mammals and birds fear 
strange rooms (Mason & Green, 1962; 
Menzel et al., 1963b), strange compartments 
(Collias & Collias, 1956), strange objects 
(Harlow & Zimmerman, 1959; Jaynes, 1957), 
strange animals (Bruckner, in Katz, 1953; 
Freedman et al., 1961; Harlow & Harlow, 
1962), and even an unfamiliar position of 
a familiar object (Melzack et al., 1959). In 
the above quotation, Moltz indicates that 
unfamiliar stimuli elicit fear, and Menzel et 
al. (1963a) write that “for most of these 
naive chimpanzees novelty rather than specific 
object qualities was the prepotent determiner 
of response, evoking a generalized caution 
which adapted only slightly even over hours 
of object exposure.” 

A working hypothesis is that fear increases 
with an increase in the proportion of novel 
to the total sample of stimulus elements per- 
ceived: that is, initial fear = f (novel stimu- 
lus elements/total stimulus elements) where f 
indicates a monotonically but not necessarily 
linearly increasing function. Novelty has its 
customary definition. The more novel a stimu- 
lus is, the less frequently and the less recently 
it has been perceived in the past. The meaning 
of stimulus elements corresponds to that 
of Estes (1959). 
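
The working hypothesis above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the source: stimulus elements are represented as set members, and the exponential form of f and its `steepness` parameter are arbitrary choices that merely satisfy the stated requirement that f be monotonically but not necessarily linearly increasing.

```python
import math


def novelty_proportion(perceived, familiar):
    """Proportion of perceived stimulus elements that are novel.

    Both arguments are collections of hashable element labels
    (a hypothetical representation of stimulus elements).
    """
    perceived = set(perceived)
    if not perceived:
        raise ValueError("at least one stimulus element must be perceived")
    novel = perceived - set(familiar)
    return len(novel) / len(perceived)


def initial_fear(proportion, steepness=3.0):
    """Illustrative f: increasing and nonlinear on [0, 1].

    The functional form is an assumption; the text only requires
    that f increase monotonically with the novelty proportion.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie in [0, 1]")
    if steepness <= 0:
        raise ValueError("steepness must be positive")
    scale = 1.0 - math.exp(-steepness)
    return (1.0 - math.exp(-steepness * proportion)) / scale


# Usage: a familiar object among otherwise novel surroundings.
p = novelty_proportion({"wall", "floor", "toy", "mother"}, {"mother"})
fear = initial_fear(p)
```

On this reading, adding familiar elements to the perceived sample lowers the proportion and therefore lowers initial fear, whatever particular increasing function is chosen for f.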


Placing a familiar object in an unfamiliar 
position can be conceived as producing novel 
stimulus elements. A moving object such as 
a wriggling snake is a rich source of novel 
stimulus elements because the relation of one 
subset of stimuli to a second subset is not 
constant. Novelty should most likely also be 
described in terms of subsets of stimuli along 
quantitative and qualitative dimensions. A 
sudden loud sound in a normally quiet en- 
vironment would be expected, for example, to 
be more novel than the same sound in a more 
noisy environment. 

The fact that the natural and surrogate 
mothers are, at first exposure, also unfamiliar 
stimuli makes one wonder why they too do 
not elicit fear. They do not because the elici- 
tation of fear by novel stimuli is a matura- 
tional phenomenon. There is a period after 
both birth in mammals and hatching in birds 
in which novel stimuli fail to elicit fear re- 
sponses; the ability of novel stimuli to elicit 
fear comes later, and apparently develops 
gradually. 

In support of this statement, Bayley 
(1932) did not observe children’s crying due 
to strange situations until 2 months, and from 
this point on it developed gradually. Fear of 
strangers did not appear until 6 or 7 months. 
Harlow (1962a) wrote that the neonatal 
monkey investigates “most external stimuli, 
including mechanical monsters, that leave an 
older animal in a state of abject terror.” Scott 
(1963) reported that fear of strange places in 
dogs first occurs at 3 weeks and increases to 
a maximum at about 6 or 7 weeks of age. 
Williams and Scott (1954) observed that fear 
responses to sounds and sights did not occur 
until the primary socialization period. 

As for birds, Moltz (1960) implies in the 
previously cited quotation that fear of 
strange stimuli does not take place before 
25 or 30 hours for several species. Jaynes 
(1957) also found fear responses to un- 
familiar stimuli gradually developing some 
time after hatching. 


REDUCTION OF FEAR BY THE MOTHER 


It has already been suggested that some 
components of the stimulus complex consti- 
tuting the mother or surrogate mother elicit 
a pleasant feeling in the young. This section 
essentially consists of more examples of the 
mother’s ability to reduce emotional distress 
in her offspring. 

Arsenian (1943) found that children accompanied 
by their mother in a strange room 
scored higher on a security scale than unaccompanied 
children. In observations of 
child-mother interactions, Ainsworth (1963) 
described the 6-month-old child as using the 
mother as a base for exploration. What is 
happening is that the child explores the environment, 
and as he moves away from his 
mother the proportion of novel stimuli gradually 
becomes greater until fear is elicited; at 
this point the child returns to the vicinity 
of his mother. 

The fear-reducing ability of the mother is 
particularly evident in Harlow and Zimmerman’s 
(1959) research. In one experiment, 
they introduced a strange object into the 
home cage of monkeys raised on single cloth 
surrogate mothers. The infants rushed to the 
mothers and clung to them. A gradual subsidence 
of fear took place. In another experiment, 
monkeys raised on cloth mothers were 
placed in a strange room containing several 
objects and the surrogate mother. The infants 
rushed to her, and clung to her. Gradually the 
fear subsided, and the monkeys began investigating 
the objects placed around the room, 
using the mother as a base for exploration. 

The effectiveness of the mother goat in 
reducing fear in her offspring has been demonstrated 
by Liddell (1954). Two pairs of 3-week-old 
twins were administered an aversive 
classical conditioning procedure; one member 
of each pair of twins underwent conditioning 
alone, the other in the presence of the 
mother. The group of twins conditioned with 
the mother absent developed a neurotic pattern 
of behavior consisting of an increasing 
inhibition of movement in the 2-minute interval 
between the CSs for shock. They 
crouched against one wall and finally cowered 
in a corner. The twins conditioned with the 
mother present continued to move freely 
through the enclosure throughout the conditioning 
procedure. 

There is evidence that mother or surrogate 
mother birds also reduce fear. Collias and 
Collias (1956) have observed that when 
Canvasback and Redhead ducklings returned 
to the mother or to the rest of the brood 
their fear was eliminated (since the stimuli 
leading to imprinting are nonspecific, the 
brood can also elicit a pleasant feeling). 
Temporary separation from the mother should 
leave the young bird vulnerable to novel 
stimuli because stimuli which elicit a pleasant 
feeling are not available. It is not unexpected, 
therefore, that many species of young birds 
react to separation from the mother or brood 
with distress calls (Collias, 1952). Moltz 
(1963) states that several investigators have 
noticed that “distress calls and startle re- 
sponses are invariably displayed whenever a 
bird is in the apparatus but not in the proxim- 
ity of the imprinting object,” and that when 
the bird is approaching or following the object 
fear is absent. In addition, Moltz, Rosenblum, 
and Halikus (1959) have demonstrated that 
Peking ducklings follow objects more strongly 
if the imprinting apparatus has been made 
a CS for shock. The interpretation of this 
result is that ducklings made fearful by a 
CS for shock are more motivated to perceive 
stimuli that elicit a pleasant feeling and 
therefore follow the imprinting object more 
assiduously. 

It might be inferred from the foregoing 
that the mother reduces fear, but it is not 
so much the mother as certain critical com- 
ponent stimuli that are effective. Infant 
rhesus monkeys raised with wire mothers, for 
example, react with crouching, rocking, and 
other stereotyped behaviors upon the placement 
of a fear stimulus in the home cage 
even though the wire mother is present. Similar 
results occur when the young monkey is 
placed in a strange room with its wire mother 
present (Harlow & Zimmerman, 1959). In 
the case of birds, Collias and Collias (1956) 
write that: “if the person being followed suddenly 
ceases all sound and movement, a day-old 
duckling will at once become lost and give 
its distress call, even though it happens to be 
perched on the shoe of the substitute parent.” 

Strong additional evidence that the mother 
reduces fear of novel stimuli comes from the 
previously discussed experiments of Mason 
and Green; Harlow and Harlow; Menzel, 
Davenport, and Rogers; and Bruckner. In 
all of these studies, the infants were separated 
from their mother at birth or shortly after it. 
Without the presence of the mother, there 
was no way for the fear of unfamiliar stimuli 
to be reduced, and indeed it never was. 

In general, fear reduction will not take 
place as long as the young are reared in an 
environment that does not contain stimuli 
that elicit a pleasant feeling—no matter how 
rich the environment may be. Harlow and 
Harlow (1962) raised infant monkeys to- 
gether and found their response to novel 
stimuli relatively normal. Probably the “choo- 
choo” linkage that the young form among 
themselves induces a pleasant feeling; that is, 
the tactual stimulation of the young holding 
on to each other is sufficiently similar to the 
stimulation involved in a single monkey hold- 
ing on to its mother that a pleasant feeling 
and fear reduction ensue. If the young of 
some other species are raised together or if 
the young of one species are raised by the 
mother of a second species, the inducement 
of a pleasant feeling and the reduction of 
fear will occur to the extent that the stimuli 
of the siblings or surrogate mother resemble 
the stimuli of the natural mother that elicit 
a pleasant feeling. 

There is one more important point to be 
made about the fear-reducing ability of the 
mother. Examples have been given of the 
mother reducing the fear the infant has of a 
given situation. Upon exposure to the same 
situation when not in the mother’s pres- 
ence, does a full-blown fear response occur 
once more, or does some mechanism exist 
whereby the relevant component stimuli 
of the mother permanently reduce the fear of 
novel stimuli? That she must be able to per- 
manently reduce fear of novel stimuli is evi- 
dent, because if she could not, then all adult 
mammals and birds would be about as fearful 
of novel stimuli as the subjects in the mother- 
separated studies. There is, however, only one 
demonstration in the literature that the 
mother permanently reduces the fear of a 
given situation. 

Liddell (1954) placed the goat twins that 
underwent conditioning with shock at 3 weeks 
of age back in the identical experimental situ- 
ation when they were 2 years old, and insti- 
tuted conditioning once more. The group 
originally conditioned without the presence of 
the mother became inhibited in their movement
and exhibited neurotic-like behavior;
the twins conditioned with the mother present,
however, continued to move freely
about the room. Liddell has conditioned many 
adult goats in similar situations, and they 
have been consistently found to exhibit 
neurotic-like behavior. It is evident, then, 
that the presence of the mother was respon- 
sible for a permanent reduction in the fear of 
the situation associated with shock. 

There is a second way in which fear of 
unfamiliar stimuli is reduced, because habitu- 
ation of the fear response to novel stimuli 
has been found to occur (Hinde, 1954; Moltz, 
Rosenblum, & Halikas, 1959). Habituation 
does not appear to be, however, an effective 
way of reducing fear of novel stimuli. In the 
earlier quotation, Menzel et al. (1963a) indi- 
cated that the generalized caution in the test- 
ing situation adapted slowly over hours of 
object exposure. Although testing was con- 
tinued for several days in both the Menzel 
et al. studies and the Mason and Green study, 
no analysis was made of a day by day de- 
crease in fearful behavior. In addition, Har- 
low and Zimmerman (1959) never mention a 
subsidence in the fear responses of young 
monkeys left without a mother or with a 
wire mother in a fear-eliciting situation. 
Finally, dogs exposed to humans for the first 
time at 14 weeks of age and tested over a 
2-week period were as fearful on the last as 
on the first day of testing. One animal was
petted and fondled over a 3-month period, 
but it showed only a slight decrease in fear 
(Freedman et al., 1961). 

An expression for the initial fear of a situ- 
ation has appeared earlier in this paper. Be- 
cause habituation occurs upon continued or 
subsequent exposure to a given situation, it 
follows that subsequent fear will be slightly 
to moderately less than initial fear. If the 
young animal was initially accompanied by 
its mother, however, subsequent exposure will 
lead to a greatly reduced fear. 


A BRIEF LOOK AT THE FURTHER
DEVELOPMENT OF THE INFANT-MOTHER
RELATIONSHIP


Evidence for the existence of a critical 
socialization period in mammals and birds has 
been summarized by Scott (1962). The termination
of the period is commonly hypothesized
to be due to the offspring’s fear of the
novelty of the mother or surrogate mother
attendant upon its first perception of her.
(The maternal object does not elicit fear
shortly after birth because fear of novel
stimuli appears at a more advanced stage of 
maturation.) One must ask, however, if at 
the end of the critical socialization period 
the set of stimuli that induced a pleasant 
feeling in the more immature animal continue 
to do so. In other words, at the termination 
of the period are both a pleasant feeling and 
a fear response simultaneously elicited by dif- 
ferent stimulus qualities or parameters of the 
mother? 

Freedman et al. (1961) found that dogs
exposed to a sitting, passive, human figure at 
9 weeks of age first avoided him. When tested 
for attraction to a handler at 14 weeks, these 
same pups scored as high in attraction as 
pups exposed to the human at 3, 5, or 7
weeks. In addition, Jaynes (1957) found that 
imprinting could take place after the critical 
period if prolonged exposure to the imprinting 
object took place. 

Both of these experiments can be inter- 
preted in the following way: the young show 
fear and initial avoidance because of the 
novelty of the surrogate mother. Then, because
they remain in the vicinity of the surrogate
mother, they come to perceive the
stimuli that elicit a pleasant feeling. These
stimuli reduce the fear of the novel stimuli, 
and the young animal comes to be as strongly 
attracted to the surrogate mother as an animal 
exposed before the period is terminated. 

Young are generally motivated to remain 
near their mothers long after the critical 
socialization period has passed. Two possible 
reasons for this are: (a) maternal stimuli 
that originally elicited a pleasant feeling con- 
tinue to do so, and (b) the reinforcement 
associated with the mother’s ability to elicit 
a pleasant feeling (and the concomitant fear
reduction) makes her a positive secondary
reinforcer, so that the infant will continue
to be attracted to her even though the mother
no longer induces the pleasant feeling. The
first alternative gains support from the interpretation
of the Freedman et al. and Jaynes
experiments just given. The second alternative
is the same as Moltz’s (1963) explanation
for the young’s continued attraction to
the imprinting object, except that he posits
a single stimulus parameter common to all
avian species as the inducer of the pleasant
feeling.


Evidence that a continued attraction to the 
mother is not a result of continued associa- 
tions of the mother with food reinforcement 
comes from Harlow and Zimmerman (1959). 
They reared one group of monkeys in the 
presence of dual mothers, cloth and wire; 
both provided milk. A second group grew up 
with a wire mother that provided milk; a third 
with a cloth mother that did not provide milk 
(this group was fed by hand at first). The 
composite emotional index of “dual-fed
raised” monkeys in an open-field test at 145
days of age with the wire mother present was 
higher than in the case where no surrogate 
mother was present; the index was lower than 
the control when the cloth mother was pres- 
ent. Dual-fed raised monkeys also spent much 
more time in contact with their cloth mothers 
than with their wire ones in the open-field 
situation. As for “single wire fed” monkeys, 
they also exhibited more fear and less contact 
with their wire mother than their “single cloth 
nonfed” counterparts. 

Although the young animal remains strongly 
attracted to the mother after the critical 
period, it is common knowledge that at some 
point in its development it begins to become 
less dependent on her. Originally the young 
are dependent on the mother for her ability 
to reduce fear of novel stimuli. Novel stimuli
are a pervasive phenomenon in a young animal’s
life, and without the mother its fear
would be overwhelming. The young will remain
dependent on the mother, therefore,
until their fear of a sufficient number of novel
stimuli is reduced. The thought that the original
and fundamental source of dependency
in humans is the child’s need for the reduction
of his fear of novel stimuli by the mother
has not yet been clearly expressed in the
child-psychology literature.

What factor, then, determines the extent to
which a young animal’s fear of novel stimuli
has been reduced? The factor is the variety
and number of novel stimuli that the infant
has been exposed to when in the presence of
the mother. Without the mother’s presence,
the fear of novel stimuli will never be reduced,
and the infant will never attain independence.
But even if the mother and infant are placed
together continuously, however, the infant will 
never attain independence if it is not permit- 
ted to perceive novel stimuli. Without the 
perception of novel stimuli, there is simply 
no opportunity for fear reduction to take 
place. 


A FUNCTIONAL INTERPRETATION 


There are two phenomena that are hypothe- 
sized to be common across all species of mam- 
mals and birds: (a) that stimuli exist which 
elicit a pleasant feeling in the young and lead 
to the permanent reduction in fear of novel 
stimuli, and (b) that novel stimuli elicit fear
responses in both young and adult organisms. 
Because birds and mammals are identical in 
these two respects, a common survival prob- 
lem is suspected (the reasoning being that, in 
general, similar behaviors will tend to be 
found in species facing identical survival 
problems). The survival problem will first be 
described, and then it will be demonstrated 
how the two phenomena spelled out above 
solve this problem. 

Birds and mammals must spend at least a 
small portion of their life traveling over and 
exploring the environment. Nutritive and 
possibly reproductive needs require this. It is 
important to realize that in the act of ventur- 
ing forth, in the act of exploring the environ- 
ment, predators and inanimate things such as 
cliffs and turbulent rivers endanger the sur- 
vival of the species. Yet these dangers must 
be chanced in order that other needs be satis- 
fied. This is the problem that faces all birds 
and mammals. What is required, therefore, is 
a mechanism by which dangerous stimuli can 
first be perceived, and then avoided. The two 
phenomena that are hypothesized to be common
to all avian species and mammalian species
constitute the mechanism.

What is there inherent in a potentially 
dangerous situation that can serve as a cue 
for an avoidance response? The young organ- 
ism is exposed to a wide variety of stimuli 
that are not dangerous while it remains de- 
pendent on the mother. These stimuli no 
longer elicit fear and avoidance. Now, exactly 
what types of stimuli has the young organism
definitely not been exposed to? What types of
stimuli will the mother definitely not lead her 
offspring to perceive? Clearly, the answer is 
those stimuli that are dangerous or potentially 
dangerous. If these stimuli are frequently per- 
ceived the survival of the species is endan- 
gered, and therefore it is reasonable to believe 
that the mother must avoid them. It follows, 
then, that the young do not perceive dangerous 
or potentially dangerous stimuli either; by 
definition, therefore, dangerous and potentially
dangerous stimuli are also novel. Because
potentially dangerous stimuli are novel
stimuli, and since novel stimuli elicit fear, 
the novel stimuli serve as cues that both indi- 
cate a dangerous situation and lead to avoid- 
ance of it. In this way the dilemma facing all 
birds and mammals is solved. 

Because the young are attracted to the 
mother and the mother to the young (Rhein- 
gold, 1963), the young and mother stay close 
together. When the mother perceives a dan- 
gerous situation she will avoid it, and the off- 
spring will either follow, be herded along by 
her, or disperse or hide upon the mother’s 
giving a warning signal. Offspring will, how- 
ever, often explore the environment on their 
own, but they are encountering novel stimuli 
which are not dangerous because they still 
remain in the mother’s vicinity. Moreover, if 
they encounter a high proportion of novel 
stimulus elements fear reduction will not take 
place because: (a) the fear-reducing ability 
of the mother is not present, and (b) quick 
avoidance will occur so that there is no time 
for habituation to take place. The general 
principle is that the young’s fear of novel 
stimuli that are dangerous will not be reduced.
From this principle, it follows that for the
adult organism the novelty of a stimulus is
correlated with its dangerousness.

Imagine a young deer with its mother. The 
mother will not approach and will tend to 
avoid the potentially dangerous stimulus of the 
smell of a nearby wolf. The infant will never 
have the fear of the novel stimulus of a nearby 
wolf smell reduced by its mother because it 
will rarely, if ever, perceive the smell as long 
as it remains with the mother. When the young
deer matures, therefore, the wolf smell is a
novel stimulus. Fear will be elicited, and the
smell will be avoided. The deer is not greatly
endangered, because the wolf is avoided even
though it is still far off. If the wolf approaches
upwind and if the wind doesn’t shift on him,
however, he will succeed in getting closer to
the deer. At this point, novel visual and auditory
stimuli that the wolf might supply will
serve as cues for avoidance. If and when the
mother perceives these cues, she will remove
her offspring from them quickly enough so
that fear reduction does not take place. When
the young deer becomes an adult, it will avoid
the sounds and sights of an approaching wolf
because they are novel and elicit fear and
avoidance. The same mechanism applies to
dangerous inanimate things. The sound stimuli
of an overflowing, rampaging river or the
smell stimuli of a forest fire are novel to the 
adult, and will therefore be avoided. 

It has been said that initial fear is a monotonically
increasing function of the proportion
of novel stimulus elements to the total number
of stimulus elements sampled. When a small
proportion of novel stimulus elements is perceived,
caution occurs. The organism appears
to be alert. Perhaps it is activating all receptor
pathways for the perception of novel
stimuli. Small proportions of novel stimulus
elements occur frequently, but it would be
nonfunctional for the animal to reverse its
tracks too often. Two things probably happen
when small proportions of novel stimulus
elements are perceived: (a) investigative
and other drives push the animal forwards,
and (b) habituation takes place, and forward
progress is resumed. If the proportion increases,
the animal prudently retreats. If the
proportion increases greatly, a full-blown fear
response is elicited, and the animal flees from
the situation. The situation in which the adult
organism perceives an approximately maximal
proportion of novel stimulus elements is the
sight, sound, and smell of a predator in the
act of attacking it. Extreme fear is elicited,
and the animal makes a violent attempt to
escape from the very large proportion of novel
stimuli.
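
The graded relation described in this paragraph may be easier to follow as a
small sketch. The Python fragment below is only an illustration under stated
assumptions: the text gives no numerical cut-off points, so the two limits
used here (CAUTION_LIMIT and RETREAT_LIMIT) and the function names are
hypothetical, chosen solely to show how a single proportion can map onto the
ordered responses of caution, retreat, and flight.

```python
# Illustrative sketch only. The cut-off values are hypothetical; the text
# states that fear rises monotonically with the proportion of novel stimulus
# elements but gives no numerical thresholds.
CAUTION_LIMIT = 0.2   # assumed upper bound for a "small" proportion
RETREAT_LIMIT = 0.6   # assumed upper bound before a full-blown fear response


def novelty_proportion(novel_elements: int, total_elements: int) -> float:
    """Proportion of novel stimulus elements among all elements sampled."""
    if total_elements <= 0:
        raise ValueError("total_elements must be positive")
    if not 0 <= novel_elements <= total_elements:
        raise ValueError("novel_elements must lie between 0 and total_elements")
    return novel_elements / total_elements


def predicted_response(proportion: float) -> str:
    """Map a proportion of novel elements onto the ordered responses in the text."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    if proportion == 0.0:
        return "none"      # nothing novel, so no fear is elicited
    if proportion <= CAUTION_LIMIT:
        return "caution"   # alertness, then habituation and forward progress
    if proportion <= RETREAT_LIMIT:
        return "retreat"   # the animal prudently withdraws
    return "flight"        # full-blown fear response


if __name__ == "__main__":
    for novel in (0, 1, 8, 19):
        p = novelty_proportion(novel, 20)
        print(f"{novel}/20 novel elements -> {predicted_response(p)}")
```

Because the categories are ordered and the mapping never moves downward as the
proportion grows, the sketch respects the monotonic relation; any other pair
of cut-offs would serve the illustration equally well.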

In adulthood, extensive reduction of the fear
of dangerous stimuli via habituation does not
take place for two reasons. Firstly, truly dangerous
stimuli are usually novel, so that avoidance
occurs before habituation ensues. Secondly,
habituation is not an effective way of
reducing the initial fear of a situation (see
page 151).

One way of solving the dilemma involved 
in the need to explore the environment and 
the danger inherent in doing so has been indi- 
cated. Actually, the number of adequate solu- 
tions of the problem is small. In order to con- 
firm this last statement, alternate solutions 
can be invented and tested for their adequacy. 
While trying out alternate solutions, the dis- 
tinction made between potentially dangerous 
stimuli (such as the smell, sound, and sight of 
a wolf that is far away) and truly dangerous
ones (a nearby wolf) will be maintained. An 
appropriate solution requires the avoidance of 
potentially as well as truly dangerous stimuli. 
One possible solution is for organisms to have 
a built-in fear of stimuli that will be danger- 
ous, but since the number of truly dangerous 
stimuli is usually very great and the number 
of potentially dangerous stimuli much greater 
still, the brain could not contain an adequate 
number of built-in mechanisms. A second pos- 
sibility is to have all novel stimuli elicit fear. 
But this solution would not be satisfactory 
because the number of novel stimuli would be 
so large that fear would overwhelm the organism.
To get around the problem of an overwhelming
fear, novel stimulus elements (no
matter how many there are) can be made to
elicit only small amounts of fear. But this
solution is not satisfactory either. Very novel
stimuli such as a pouncing predator would
elicit a small amount of fear, but what good
would caution do in this situation? Another
solution is to let the animal quickly and extensively
habituate to novel stimuli. With this
solution, however, habituation to dangerous
stimuli would occur and the species would not
survive.

Another possibility (and the traditional, yet 
undocumented, one) is that the young learn to 
avoid dangers by being exposed to their own 
mother’s reactions to the dangers; somehow
mothers teach their offspring to avoid dangerous
stimuli or somehow the young learn to
imitate their mother’s behavior to dangerous
stimuli. In the first place, if the stimulus is
dangerous, the mother will most often remove
her offspring from the situation. The extent of
time that the infant clearly perceives the dangerous
stimulus is therefore very small, small
enough, in fact, that the chance that the infant
learns some sort of connection between the 
mother’s changed emotional state or changed 
behavior and the dangerous stimulus is very 
slight. Secondly, it is not at all clear how, if 
the mother is afraid of a certain stimulus, the 
infant will learn to be afraid of that same 
stimulus (one could, of course, invent a learn- 
ing paradigm to predict this—but the predic- 
tion is no guarantee that the phenomenon oc- 
curs). Thirdly, the suggestion that the young 
learn their mother’s response to a dangerous 
stimulus implies that the mother is fairly 
frequently exposed to dangerous situations— 
an event which, if true, would endanger the 
survival of the species. Fourthly, this learning- 
from-the-mother hypothesis raises the possi- 
bility that the offspring will not avoid dan- 
gerous stimuli that they have not been ex- 
posed to in the presence of the mother, again 
implying the eventual extinction of the spe- 
cies. 

The infant-mother relationship is also a 
mechanism by which similar behaviors can be 
transferred from one generation to another by 
experiential factors and not by genetic trans- 
mission. Because the young remain close to 
the mother, the same stimuli that are familiar 
to the mother will also be familiar to them. 
When the offspring become independent of 
the mother, they will tend to avoid all novel 
stimuli, and therefore no additions to the 
population of stimuli that are familiar to the 
offspring can be made. The stimuli that were 
familiar to the mother will therefore remain 
as the only stimuli that are familiar to the 
offspring throughout their lifetime. Thus, the 
offspring of the offspring will be exposed to 
the same familiar stimuli as their mother, 
which are the same stimuli that were familiar 
to their mother’s mother, and the cycle will 
continue indefinitely. 


SUMMARY 


Two basic hypotheses have been made: 
that in all birds and mammals (a) certain
stimuli elicit a pleasant feeling and lead to the 
permanent reduction of fear of novel stimuli, 
and (b) novel stimuli elicit fear in young and 
adult alike. The stimuli that elicit a pleasant 
feeling, in the person of the mother, lead to 
the permanent fear reduction of the stimuli
that the mother encounters. Fear reduction of
a sufficient number of novel stimuli permits 
independent exploratory behavior. Those 
stimuli not perceived by the young in the 
presence of the mother will, since they are 
novel, elicit fear and avoidance in the adult 
organism. This is a functional state of affairs, 
because, for the adult, there is a positive cor- 
relation between the novelty of a situation 
and its dangerousness.


RESEARCH WITH THE STANFORD-BINET,
FORM L-M: 


THE FIRST FIVE YEARS 


PHILIP HIMELSTEIN 


Texas Western College of the University of Texas 


Research literature on the Stanford-Binet (S-B), Form L-M, is reviewed
beginning with 1960, the year of publication of the latest revision. Major 
areas include validity, reliability, sensitivity to extratest influences, performance 
of southern Negro children, influence of socioeconomic status, diagnostic value, 
and brief forms. Almost all validity studies are concerned with concurrent 
validity, in which the S-B is correlated with another test. 


The appearance of a new edition of the 
Stanford-Binet (S-B) has always been a land- 
mark in the history of clinical psychology and 
in the intelligence-testing movement. The 
1916 edition practically shaped the format 
and style of subsequent scales designed for 
individual testing. The 1937 revision, until 
the appearance of the Wechsler Intelligence 
Scale for Children (WISC), was unrivaled in 
popularity for use with children. In 1960, the 
third revision of the S-B, Form L-M, replaced 
the two forms of the 1937 edition, Form L 
and Form M. 

The third revision has had ample time to 
find its way into clinics, schools, and hospitals 
and to be subjected to scrutiny in a wide 
variety of studies and experiments. The S-B, 
while always popular as an instrument for 
clinical practice and for research, has never 
been the subject of the comprehensive reviews 
that the Wechsler scales have enjoyed. The 
WISC, for example, has been reviewed by 
Littell (1960) and the Wechsler Adult Intel- 
ligence Scale (WAIS) by Guertin, Rabin, 
Frank, and Ladd (1962). The purpose of this 
paper is to provide a similar survey of re- 
search for the latest S-B revision, 5 years 
after its appearance on the testing scene. To 
a considerable extent, this paper will conform 
to the format for reviewing test research 
adopted by Littell (1960). 

In addition to restriction in competency of 
any reviewer to locate every article on the 
subject of the S-B, there were some difficulties 
created by the writers of these articles that 
rendered them useless for this review. Perhaps
the chief difficulty was the failure to indicate
which form (L-M, L, or M) was employed
in a particular study. During the
transition from Forms L and M to the latest
revision, the need to specify the form employed
in a study apparently was not fully
appreciated. A second difficulty was a tendency
to combine all three forms, the two
from 1937 and the one from 1960, for research
purposes without analyzing the results
separately for the various forms. Studies that
fell in either of these two categories were
excluded from this review.


THE SCALE 


Form L-M, the third revision of the S-B,
combines the best subtests from Forms L and
M of the 1937 edition into a single scale.
This means that there are no alternate forms
for retesting purposes. The manual (Terman
& Merrill, 1960), however, points out that
with the development of additional individual
tests, alternate forms are not critical to
longitudinal research. Form L-M is not the
result of an extensive standardization procedure
involving normative, reliability, and
validity studies. It results, rather, from an
item-difficulty analysis of test records administered
from 1950 through 1954. As in earlier
editions of the S-B, the placement of an item
in the scale is determined by the percentage
passing at each successive age level. The
assessment group consisted of children between
the ages of 2 years 6 months and 18.
This group, while not a representative sample
of school children, made possible an examination
of regional and socioeconomic factors in
subtest difficulty.

The important difference between the latest
revision and the previous editions is the
change to a standard score or deviation IQ
(DIQ) rather than the ratio IQ. The new
standard score has a mean of 100 and a standard
deviation of 16 for all age groups. One
of the principal defects of the earlier scales
had been fluctuations in mean IQs and standard
deviations as a function of age. This has
now been corrected by the use of the new
system. Increased stability in IQ as a result
of the use of the DIQ is reported by Pinneau
(1959).
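
The contrast between the ratio IQ and the deviation IQ can be made concrete
with a short sketch. The Python fragment below assumes the conventional
definitions (ratio IQ as 100 times mental age over chronological age,
deviation IQ as a standard score rescaled to a mean of 100 and a standard
deviation of 16). The age-group standard deviations of 12.5 and 20 are
hypothetical figures chosen only to show the kind of fluctuation described;
they are not taken from the manual, whose conversion is done by table.

```python
# Illustrative sketch; the age-group values below are hypothetical.

def ratio_iq(mental_age_months: float, chronological_age_months: float) -> float:
    """Ratio IQ: 100 * MA / CA."""
    return 100.0 * mental_age_months / chronological_age_months


def deviation_iq(score: float, age_group_mean: float, age_group_sd: float,
                 mean: float = 100.0, sd: float = 16.0) -> float:
    """Deviation IQ: the z score within the age group, rescaled to mean 100, SD 16."""
    z = (score - age_group_mean) / age_group_sd
    return mean + sd * z


if __name__ == "__main__":
    # Two hypothetical age groups in which ratio IQs have the same mean (100)
    # but different spreads. A child one standard deviation above the mean
    # earns a different ratio IQ in each group ...
    younger_ratio_iq = 112.5   # +1 SD where the assumed SD is 12.5
    older_ratio_iq = 120.0     # +1 SD where the assumed SD is 20

    # ... but the same deviation IQ once the score is expressed relative to
    # the child's own age group.
    print(deviation_iq(younger_ratio_iq, 100.0, 12.5))
    print(deviation_iq(older_ratio_iq, 100.0, 20.0))
```

The point of the rescaling is that a given DIQ denotes the same relative
standing at every age, which is the stability the revision was meant to
secure.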

Another important change is the extension 
of the IQ tables to include ages 17 and 18, 
rather than the limit of age 16 found in previous
Binet tables. The new ceiling may still
be inadequate when testing gifted adolescents.
The study by Kennedy, Nelson, Lindner,
Moon, and Turner (1960), employing 11th-grade
adolescents enrolled in a National Science
Foundation Summer Mathematics Institute,
reported that the maximal level could
not be determined for these subjects. An additional
defect noted in the latest revision is
that abstract verbal items appear at too low
a level in the test and rote memory items are
placed too high on the scale (Kennedy, Van 
de Reit, & White, 1963a). 


VALIDITY OF THE THIRD REVISION 


The vast bulk of the studies investigating 
the S-B would be classified as studies of con- 
current validity, which is concerned with the 
agreement of a test with a criterion obtained 
at very nearly the same time as the administration
of the test (Cronbach, 1960, p. 104).
There seem to be no studies in which the S-B
was employed for the prediction of behavior,
nor have studies involving the construct validity
of the S-B appeared. Sattler’s (1965)
study, an analysis of the item content, can
be classified as an investigation of “content
validity.” Most of the studies that are related
to the question of validity for the third revision
involve correlations of the S-B with
other intelligence tests, both individual and
group, and with achievement tests and teacher
ratings.


Second Revision, S-B 


In view of almost a quarter century of
familiarity with the 1937 revision of S-B, it
is no surprise that efforts would be made
to compare the old and the new revisions.

In one of the earliest studies with the new 
instrument, Estes, Curtis, DeBurger, and 
Denny (1961) correlated Form L-M with a 
variety of intelligence tests including the 1937 
S-B. This study took the precaution of con- 
verting the 1937 S-B scores to DIQs. The 
subjects, drawn from a university training 
school and with a mean IQ of 123, are not 
representative of school children in general.
The correlation between the two S-B IQs was 
.82. Prediction of performance from one Binet 
scale to the other varied according to intellectual
level, with a significant discrepancy
between 1937 and 1960 S-B IQs at the Very 
Superior level. For this intellectual group, 
the mean IQ on the 1937 S-B was eight IQ 
points higher than on the 1960 S-B. 

In the only other study found involving 
the two revisions of the S-B, Budoff and 
Purseglove (1963a) used a sample of institu- 
tionalized mentally retarded adolescents, tested 
1 month apart with both forms in counter- 
balanced order. In spite of the restricted 
range in the sample, they obtained a product-moment
correlation (r) of .90 between the
two tests. Since Form L-M contains items 
drawn predominantly from Form L, the prob- 
lem of practice effects on overlapping items 
may not have been cancelled out by the 
interval of 1 month. 

The two studies surveyed indicate a healthy 
relationship between the 1937 and 1960 S-Bs. 
The results of one study gave evidence that 
the 1960 S-B results in lower mean IQs for 
the Very Superior, even when ratio IQs are 
converted to DIQs. These two studies illus- 
trate one big gap in 1960 S-B research: the 
lack of studies based on subjects drawn from 
the entire spectrum of intellectual ability, 
rather than from the gifted and retarded ends 
of the distribution. 


Wechsler Intelligence Scale for Children 

It is perhaps ironic that the WISC, which 
had to “earn its spurs” by demonstrating a 
relationship to the 1937 S-B, should now be 
used as the criterion against which the 1960 
S-B is tested.

Estes et al. (1961) obtained a correlation 
of .74 between the S-B 1960 and the WISC. 
This study did not find that age produced
a discrepancy between WISC IQ and S-B IQ.
It did support previous reports, summarized 
by Littell (1960), that intelligence level plays 
a determining role in the discrepancy between 
the IQs obtained from these two scales. In 
this, and in previous research, the S-B pro- 
duced higher IQs in the Superior range. The 
direction of the discrepancy is, of course, 
to be anticipated in view of the larger 
standard deviation of the Binet scale. 

Several studies have compared the two tests 
of intelligence with samples of mentally re- 
tarded children. In a study with familial and 
organic retardates, Rohs and Haworth (1962) 
obtained, for the combined sample, a correla- 
tion of .69 between WISC Full Scale IQ and 
S-B IQ. Both the WISC Verbal and Per- 
formance IQs were significantly correlated 
with S-B IQ. As expected, a higher correlation 
for the verbal scale than for the performance 
scale was obtained. Although this study em- 
ploys a group at the opposite end of the 
distribution from that of the Estes et al. 
(1961) study, both obtained a significantly 
lower mean score on the WISC Full Scale 
than on the S-B. This result certainly cannot 
be explained by the lower ceiling on the 
WISC, which could be the basis for the find- 
ing by Estes et al. in which a group of chil- 
dren in a university training school were 
studied. Nor can it be explained by reference 
to the respective standard deviation of the 
two scales, since the larger standard deviation 
of the Binet scales means greater dispersion 
of scores in comparison to the WISC. 
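
The standard-deviation argument can be checked with a little arithmetic. The
sketch below assumes the S-B standard deviation of 16 given for Form L-M and a
WISC standard deviation of 15, a figure that is not stated in this passage. It
converts the same relative standing (z score) into an IQ on each scale and
reports the expected S-B minus WISC difference; the function names are
illustrative only.

```python
# Illustrative sketch. SB_SD comes from the description of Form L-M;
# WISC_SD = 15 is assumed here and is not given in the passage.
SB_SD = 16.0
WISC_SD = 15.0


def iq_from_z(z: float, sd: float, mean: float = 100.0) -> float:
    """IQ on a scale with the given mean and SD for a standing of z."""
    return mean + sd * z


def expected_gap(z: float) -> float:
    """Expected S-B IQ minus WISC IQ for the same relative standing z."""
    return iq_from_z(z, SB_SD) - iq_from_z(z, WISC_SD)


if __name__ == "__main__":
    for z in (2.0, 0.0, -2.0):
        print(f"z = {z:+.1f}: S-B {iq_from_z(z, SB_SD):.0f}, "
              f"WISC {iq_from_z(z, WISC_SD):.0f}, gap {expected_gap(z):+.0f}")
```

Under these assumptions the larger Binet standard deviation predicts higher
S-B IQs only above the mean and lower S-B IQs below it, so it cannot account
for a retarded sample that scores higher on the S-B than on the WISC.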

Finally, Hutton (1964) compared a sample
of children from a special class on a
task common to both the S-B and the WISC: 
digit repetition. He found that his subjects 
tended to be significantly more successful on 
the S-B repetition than on the WISC Digit 
Span subtest. Apparently the controversy of 
consecutive testing versus the standard S-B 
procedure will be reopened with the appear- 
ance of the new revision. 


Wechsler Adult Intelligence Scale 


Since the upper age limit of the new S-B 
now exceeds that of the WISC, studies comparing
the S-B and the WAIS can be anticipated.
To date, two studies have appeared
in which both the WAIS and the S-B were
employed, one with retardates and the other
with gifted adolescents. In the study with the
gifted, Kennedy et al. (1960) obtained an r
of .52 between S-B IQ and WAIS Full Scale
IQ. Perhaps because of the restriction of
range (mean S-B IQ of 137), this correlation
is lower than any in the summary of Binet
and Wechsler correlations reported by Littell
(1960, pp. 136-138). In this study, a nonsignificant
correlation between S-B IQ and WAIS
Performance IQ was obtained. This may be
due, again, to the extreme restriction in range
and to the small size of the sample (N = 2).

In the second study, Fisher, Kilman, and 
Shotwell (1961) administered both scales to a 
sample of familial or undifferentiated retardates 
in counterbalanced order. The subjects 
ranged in age from 18 to 73. Correlations between 
the two instruments ranged from 1 
at ages 18-34 to .777 at ages 55-73. Of the 
180 subjects, only 3 had an S-B IQ higher 
than their WAIS IQ, and an analysis of variance 
of the difference scores indicated that 
age, but not IQ level, was significant in determining 
the magnitude of the discrepancy 
between the two IQs. This is in contrast with 
the Estes et al. (1961) study with the WISC 
in which the reverse was found: intelligence 
level but not age determined the discrepancy 
between the IQs. The results of the Fisher et 
al. study are in keeping with previous research 
comparing the WISC and 1937 S-B 
in which WISC Full Scale IQ was found to 
be somewhat higher than the S-B IQ for 
defective children (Littell, 1960). 

The reversal of findings obtained by Estes 
et al. and Fisher et al. may be due to several 
factors. First, two different Wechsler scales 
are involved and perhaps what is true for the 
WAIS is not true for the WISC. Second, 
these two studies differ significantly in the 
populations from which the samples were 
drawn, particularly in age level and in 
intellectual level. 


Draw-A-Man Test 


The DAM turns out to be the one instrument 
most frequently employed for comparison 
with the new S-B. In fact, with {0 
studies, it equals the combined totals for 
the WISC and WAIS. 

The most exhaustive study of the DAM-S-B 
relationship is that of Kennedy and Lindner 
(1964) in which the sample consisted of 1,800 
Negro elementary school children in the 
southeastern states. No other study surveyed 
rivals this one for sample size or for the 
spread of talent in the sample. With the 
scoring weights assigned in the original deri- 
vation of the DAM (Goodenough, 1926), 
these investigators obtained correlations rang- 
ing from .59 at Grade 1 to .29 at Grade 
4. When the authors recomputed the DAM 
score on the basis of their own weights, 
the r for the total sample was substantially 
increased to .67. This procedure, however, 
has not been cross-validated and therefore 
the obtained correlation is, to some degree, 
spurious. 

In the studies with the original scoring 
method, Estes et al. (1961) obtained an r 
of .43 for a sample of children in a university 
school, while Thompson and Finley (1963) 
obtained an r of .67 with a group of chil- 
dren referred for school guidance. A non- 
significant correlation (r = .28, N = 46) was 
reported by Rohs and Haworth (1962) for 
a group of institutionalized retardates. Once 
again, failure to correct for restriction in 
range may be a contributing factor for low 
correlations. 
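
The effect of restriction in range on a correlation such as the r = .28 
noted above can be sketched with the standard correction for direct 
restriction in range (Thorndike's Case 2). This is only an illustration, 
not an analysis drawn from any of the studies reviewed: the function name 
and the two standard deviations (16 for an unrestricted group, 8 for the 
restricted sample) are assumptions chosen to show the direction and rough 
size of the effect. 

```python
import math

def correct_for_range_restriction(r_restricted, sd_unrestricted, sd_restricted):
    """Thorndike Case 2 correction for direct restriction in range.

    r_restricted: correlation observed in the restricted sample.
    sd_unrestricted, sd_restricted: standard deviations of the selection
    variable in the full population and in the restricted sample.
    """
    k = sd_unrestricted / sd_restricted
    numerator = r_restricted * k
    denominator = math.sqrt(1 - r_restricted ** 2 + (r_restricted * k) ** 2)
    return numerator / denominator

# Observed value taken from the text; both SDs are hypothetical.
r_observed = 0.28
r_corrected = correct_for_range_restriction(
    r_observed, sd_unrestricted=16.0, sd_restricted=8.0
)
print(round(r_corrected, 2))
```

Under these assumed values the corrected coefficient is noticeably larger 
than the observed one, which is the sense in which an uncorrected r from a 
homogeneous sample may understate the relationship. 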


Other Individual Tests 


A few investigators have compared the new 
Binet with tests other than those previously 
mentioned. Briefly, these studies include the 
following: Budoff and Purseglove (1963b), 
using the Peabody Picture Vocabulary Test 
(PPVT) with institutionalized retardates, 
obtained an r of .88 for Form A of the PPVT 
and an r of .83 for Form B. They found 
different levels of relationship for trainable 
and educable groups, with lower coefficients 
for high-grade groups. Sternlicht (1965) obtained 
a correlation of .90 between S-B and 
Kuhlmann Test of Mental Development IQs. 
In this study, the sample consisted of 
recently-admitted retarded patients between 
the ages of 4-0 and 10-9 for whom no basal 
age on the S-B could be determined. This 
group had a mean S-B IQ of 26.6 with a 
standard deviation of 7.4, resulting in an 
extreme restriction in range. 

It is difficult, on the basis of two studies 
with small samples, to make any meaningful 
generalizations in this area. Apparently both 
the PPVT and the Kuhlmann correlate about 
as well with the new S-B as do the Wechsler 
scales. 


Developmental Schedules 


While developmental schedules do not, as a 
rule, purport to measure intelligence, the rela- 
tionship between tests of intelligence and de- 
velopmental scales has always been of inter- 
est for both practical and theoretical reasons. 
Instruments such as the Gesell Developmental 
Schedules measure four areas of behavior, and 
the Vineland Social Maturity Scale measures 
development in the social areas. These can 
be administered to very young children, or, 
in the case of the Vineland, to individuals 
who are familiar with the child or older 
person. If these instruments can be demon- 
strated to be significantly related to intel- 
ligence-test scores, the problem of testing 
infants or otherwise untestable subjects would 
be somewhat simplified. 

In a study of the Vineland, Fisher et al. 
(1961) obtained a correlation of .405 be- 
tween S-B IQ and Vineland Social Age (SA) 
for 180 mentally retarded subjects in state 
schools. There was a trend for both S-B IQs 
and Vineland SAs to decrease with age for 
the sample that ranged in age from 18 to 72. 
Except for an unaccountable drop for ages 
35-44, the correlation between the two instru- 
ments increased as age increased. 

While not a study of the concurrent valid- 
ity of the S-B, the study by Share, Koch, 
Webb, and Graliker (1964) is the only one 
discovered that deals with the Gesell Sched- 
ules and the S-B. This study is actually con- 
cerned with the predictive validity of the Ge- 
sell for a sample of children with Down’s 
Syndrome (Mongolism). This study found 
that the Gesell predicts later S-B scores, with 
correlations ranging from .93 to .57, for 
Intelligence Quotient and Developmental 
Quotient, depending on the interval between the 
administration of the Gesell and of the S-B. 


Group Intelligence Tests 


Relationships between the S-B and the 
California Test of Mental Maturity (CTMM) 
were reported in two studies. Baldwin (1962) 
administered the CTMM and the S-B to 
kindergarten children who had been judged 
to be gifted by their teachers after 6 weeks 
of class and after 6 months. Using an S-B IQ 
of 130 as the criterion for giftedness, he 
found that the CTMM agreed with the S-B 
criterion in 39% of the cases. It is difficult 
to translate this finding into the usual corre- 
lational form of expressing a relationship. 
Tatham and Dole (1963) administered a 
short form of the CTMM to two successive 
samples of children at a university elementary 
school. For the first sample, the S-B and the 
CTMM correlated .41, and, for the second 
sample, .56. 

In an effort to locate gifted pupils, several 
group tests were administered by Blosser 
(1963) to all members of a ninth grade. To 
those pupils who fell at or above the median 
on either the Otis Quick-Scoring or the Henmon- 
Nelson, the S-B was administered. For this 
group of 187 students, the correlation be- 
tween S-B and Otis IQs was .66 and, be- 
tween S-B IQ and Henmon-Nelson raw score, 
Be [7 2a 

In this area, we can accept a conclusion 
offered by Littell (1960) in his summary of 
the studies of the WISC and its relationship 
to group intelligence tests: “The small num- 
ber of studies precludes more than the very 
tentative acceptance of these conclusions [p. 
141].” 


Achievement Tests 


There is a paucity of studies relating S-B 
scores to scholastic achievement. Kennedy et 
al. (1960) administered the Sequential Test 
of Educational Progress (STEP), Mathe- 
matics, to a group of superior 11th graders. 
Total score on the STEP correlated .50 with 
S-B IQ, while STEP Theory and STEP 
Sets and Functions correlated .48 and .53, 
respectively, with IQ. Kennedy, Van De 
Riet, and White (1963a) administered the 
S-B and Form W of the 1937 revision of the 
California Achievement Test (CAT) to a 
large sample of Negro elementary school chil- 
dren. Correlations between S-B mental age 
and CAT grade placements were as follows: 
Reading, .68; Arithmetic, .64; Language, .70; 
Battery, .69. Tatham and Dole (1963) derived 
a measure of school knowledge from 
the CAT and obtained a correlation of $ 
between S-B IQ and school knowledge. The 
sample consisted of 44 children enrolled in a 
university training school. 


Teacher Ratings 


It has long been thought that judgments by 
classroom teachers are fallible criterion measures 
for purposes of validating a test of 
intelligence. Recent studies with the S-B do 
nothing to disturb this impression. From a 
pool of 200 pupils, Baldwin (1962), in the 
study previously mentioned, found the teachers 
judged 46 as gifted after 6 weeks and, at 
6 months, 40 were so judged. The mean S-B 
IQ of all children selected by the teachers on 
the two occasions was 121.3, with a range of 
92-158. Only 38% of those selected as gifted 
at 6 months met the definition of gifted: an 
S-B of 130 or higher. It should be pointed 
out that the investigator’s definition of giftedness 
is rather narrow and unrealistic and 
that the instructions to the teachers lacked 
precision as to the meaning of “giftedness.” 

In the study by Kennedy et al. (1963a) 
involving Negro elementary school children, 
significant correlations between ratings by 
teachers and S-B IQ were obtained, but these 
were of low order. The overall ratings correlated 
.32 with the S-B. Other correlations 
between the S-B and teacher ratings were 
as follows: Reading, .30; Spelling, li 
Writing, .25; and Discipline, .15. 

Terman & Merrill (1960) continue, as in 
the past, to offer evidence for the validity of 
the S-B based on internal, built-in evidence 
(e.g., selection of items according to age on 
the 1937 scale) rather than on the correlation 
of the scale with external criteria. It is obvious, 
however, that the S-B continues in its 
role as the external criterion for newly 
developed scales and, rightly or wrongly, as 
the major criterion for giftedness. Apparently 
only the Wechsler scales, particularly the 
WISC, are generally accepted as an adequate 
gauge for exploring the validity of the S-B. 


CHARACTERISTICS OF THE S-B AS A 
MEASURING INSTRUMENT 


So firmly entrenched as a measuring device 
is the S-B that little concern is expressed, 
particularly in research, for such matters as 
reliability, the effects of various incentives, 
and the influence of situational variables, such 
as “warm” versus “cold” examiners. Several 
recent, albeit infrequent, studies bear on some 
of these points. 


Reliability 

The chief line of evidence for the reliability 
of the S-B continues to be derived from the 
test manual (Terman & Merrill, 1960), and is 
based on biserial correlations between items 
and total score. As the test authors point out, 
average biserials tend to be highest at the 
adult levels, ranging from a high of .80 at 
the Superior Adult II level to a low of .64 
at the Superior Adult III level. The lowest 
average biserials are obtained at the preschool 
level, with the lowest average obtained at 
age 3 (.53). 

There appears to be only one study con- 
cerned with test-retest reliability. This is the 
study of Share et al. (1964) in which young 
children with Down’s Syndrome were tested 
with the S-B at a 1-year interval. The 
correlation for IQs was .88 and, for mental 
ages, .86. Considering the problem of restric- 
tion in range, the study indicates that the 
S-B has adequate test-retest reliability, even 
at the younger age levels. 


Sensitivity to Other Factors 


Rarely, if ever, does a psychological ex- 
aminer assume that the test results obtained 
from S-B (or any other test) represent raw 
ability, uncontaminated by such factors as 
the child-examiner interaction, the child’s 
previous experiences in test and nontest situa- 
tions, etc. Unfortunately, the research with 
the S-B does not shed much light on the 
relative importance of extratest influences. 

Variables in the Test Situation. Tiber and 
Kennedy (1964) studied the effects of various 
incentives on the test performance of second 
and third grade children from three social 
groups. The social groups consisted of middle- 
class white children, lower-class whites, and 
lower-class Negroes. The incentive groups 
consisted of verbal praise, verbal reproof, 
candy reward, and control. No significant 
differences between incentive groups nor in 
the interaction between type of incentive and 
social group were obtained. Differences were 
obtained between different class and caste 
groups (these will be discussed in a following 
section). Cieutat (1965) explored examiner 
variables with a large group of children and 
found significant differences among examiners. 
Female examiners elicited IQs significantly 
higher than those obtained by the male ex- 
aminers. The sex of the subject was not a 
significant factor in these results. 


Range of Application of the S-B 


It has been known practically since Binet 
testing began in this country that considera- 
tion must be given to such factors as group 
memberships (social class, racial groups, 
urban-rural residence, etc.) in interpreting 
test results. Research with previous editions 
of the S-B has amply demonstrated the com- 
plex influence of such memberships on test 
performance. Whether it is necessary to redo 
the voluminous body of research with the 
latest revision, which does not result from a 
complete restandardization, is open to debate. 
It appears that the third revision will also be 
subjected to scrutiny regarding the role of 
such influences on test score. 

Southern Negro Children. The problem of 
racial differences, and their origins, in 
intelligence-test performance is still an open 
one and subject to considerable research and 
controversy. Earlier findings regarding the in- 
ferior performance of Southern Negro chil- 
dren on the S-B seem to be confirmed with 
the 1960 revision. The reasons for these re- 
sults are still subject to speculation and fur- 
ther investigation. In the study by Tiber and 
Kennedy (1964), lower-class Negro children 
obtained a mean IQ of 77.39, considerably 
lower than the means obtained by either 
lower-class whites or middle-class whites 
(93.96 and 107.59, respectively). In a study 
with a large stratified random sample of 
Negro elementary school children in south- 
eastern states, Kennedy et al. (1963a) ob- 
tained a sample mean of 80.7 (SD = 16.4) 
and reported that, in addition to a signifi- 
cantly lower mean score, the distribution of 
Negro scores is leptokurtic and has a narrower 
range. The appropriateness of comparing 
the scores of Negro children with the 
white, predominately middle-class group of 
children in the standardization group is open 
to question. These writers reported that a 
score of 79 obtained from a Negro student is 
equivalent to a score of 100 obtained from a 
white child. On the basis of this finding, they 
cautioned against the use of a Binet IQ cut- 
off score for making such decisions as special- 
class placement, mandatory sterilization, etc. 

Although the prevailing belief is that the 
Negro child’s IQ drops as he progresses 
through school, Kennedy et al. found little 
difference between IQ means at the various 
grade levels. These authors did find a significant 
negative correlation between chronological age 
and IQ. While this apparently means that 
there is a decrease in IQ with age for the 
Southern Negro child, Schaefer (1965) ex- 
plains these findings in terms of the method 
of sampling. 

Socioeconomic Status. Most of the current 
crop of studies relating tested intelligence to 
socioeconomic status were developed within 
the context of studying the Southern Negro 
child. As noted above, Tiber and Kennedy 
(1964) obtained a significant difference when 
comparing the middle-class white child with 
the lower-class white child. Among Negro 
children, Kennedy et al. (1963a) found that 
upper-class children (determined by the 
McGuire-White Index) had a mean S-B IQ 
of 105, while the low-low class (which, inci- 
dentally, made up over half of the total 
sample) had a mean of 79.39. When reduced 
to a measure of relationship, the correlation 
between IQ and score on the McGuire-White 
Scale was only .27. In a presentation of the 
same data, Kennedy and Lindner (1964) 
show a pronounced trend for means to drop 
as one moves down the McGuire-White social 
ladder. 


SHORT FORMS OF THE S-B 


The S-B provides for a shorter version for 
purposes of administration by the redistribution 
of credits obtained by four selected tests 
among the six found at any age level. According 
to the manual (Terman & Merrill, 1960) 
this would eliminate about one-quarter of 
the usual testing time. Another procedure, offered 
by Wright (1942), is identical to this procedure 
with the addition of obtaining the 
basal and ceiling levels on the basis of all 
six tests at any age level. Three studies have 
appeared which have compared the shorter 
version with the full-length administration. 
All of these studies have the problem of 
administering the short form as a part of the 
full scale, with which it is then correlated. 
Two of the reported studies recognized the 
problem. 
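
Why correlating a short form with a full scale that contains it is a 
problem can be shown with a small calculation. The sketch below is an 
illustration under simplifying assumptions that do not come from any of 
the studies reviewed: every test at a level is given unit variance, every 
pair of tests is given the same intercorrelation rho, and the function 
name is hypothetical. It returns the correlation between the sum of four 
tests and the sum of all six. 

```python
import math

def part_whole_r(n_short, n_full, rho):
    """Correlation between a sum of n_short tests and the sum of all
    n_full tests (which includes the n_short ones), assuming each test
    has unit variance and every pair of tests correlates rho."""
    # Each short-form test covaries 1 with itself and rho with the others.
    cov = n_short + n_short * (n_full - 1) * rho
    var_short = n_short + n_short * (n_short - 1) * rho
    var_full = n_full + n_full * (n_full - 1) * rho
    return cov / math.sqrt(var_short * var_full)

# Four of six tests, for several assumed intercorrelations.
for rho in (0.0, 0.3, 0.6):
    print(rho, round(part_whole_r(4, 6, rho), 3))
```

Even when the six tests are assumed to be completely unrelated (rho = 0), 
the four-test sum correlates about .82 with the six-test sum, simply 
because the same four scores appear on both sides. High short-form 
versus full-scale coefficients obtained in this way are therefore partly 
built in. 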

Silverstein and Fisher (1961) and Silverstein 
(1963) compared the manual’s suggested 
short form and that of Wright with 
retarded subjects. The first study was concerned 
with retarded adults and the second 
with retarded children. With the retarded 
adults, Silverstein and Fisher found that estimates 
of IQ with the short forms were 
accurate within one IQ point in from two- 
thirds to three-fourths of the cases. The mean 
IQ on both short forms was 43.4, less than 
one IQ point difference from the full S-B IQ 
of 44.1. This difference was statistically significant 
at the .001 level, but it is of little 
practical importance. In the second study, by 
Silverstein, no significant difference between 
“short” IQs and “full” IQs was found. The 
correlation between Wright’s short form and 
the full Binet was .98. For Terman and Merrill’s 
short form and the full scale, the correlation 
was .95. In both studies with retarded 
subjects, the shorter version of the S-B resulted 
in a slight underestimation of the IQ 
obtained from the complete administration. 

Kennedy et al. (1963b) compared MAs, 
but not IQs, with a large sample of Negro 
school children. The overall correlation for 
the entire sample was .99, but correlations 
were lower (.93) for the youngest group of 
children, who ranged in age from 60 months 
to 71 months. The abbreviated MA mean was 
found to be two points lower than the full 
form and the standard deviation was 
months lower. The authors concluded that an 
error of more than 5 months of MA is made 
less than 3% of the time by the use of the 
manual’s suggested short form. This study 
supports the findings by Silverstein and 
Fisher, and by Silverstein, that the use of 
the short form slightly underestimates the 
full-scale score. 


DIAGNOSTIC VALUE OF THE S-B 


While the Wechsler Scales, with three IQs 
and arrangements of subtests to permit an 
intratest pattern analysis, have been widely 
investigated for their diagnostic value, the S-B 
has not enjoyed much popularity in this area. 
The S-B has, of course, been employed to 
assist in making decisions regarding mental 
subnormality, but research on differential 
Binet performance as a function of nosological 
category has been lacking in the past. If the 
first 5 years of Form L-M research are pre- 
dictive of the future, this void in the Binet 
literature will continue. Only two studies 
bearing on the potential utility of the S-B 
in clinical situations have been located. One, 
by Rohs and Haworth (1962), indirectly ad- 
dresses itself to the problem of differentiating 
between familial and organic retardates. This 
study found three differences between the two 
groups: mean IQ, intrarange item failure, 
and number of intrarange age levels covered. 
The organics obtained a higher mean IQ and 
demonstrated less scatter on the two measures 
employed than did the familials. 

The second study was concerned with the 
evaluation of the effects of neonatal jaundice 
on intelligence-test results. In this investiga- 
tion, Van Camp (1964) found no difference 
between these children and the controls (who 
demonstrated no neonatal disease). A signifi- 
cant difference was obtained within the dis- 
ease category when divided into “severe 
symptoms” and “less severe symptoms” cate- 
gories, with the children with the less 
severe symptoms approximating the mean IQ 
obtained from the control group. 

It is obvious that a considerable amount 
of research must be performed if the S-B 
is to be helpful beyond contributing to the 
understanding of the rate of mental development 
of the individual patient in the diagnostic 
process. While research with the WISC 
(see Littell, 1960, pp. 149-152) does not provide 
a sense of confidence that the Binet scale 
will yield patterns and signs that are diagnostically 
useful, this remains one of the 
unexplored areas of the S-B. 


CONCLUDING REMARKS 


The first 5 years of research with the S-B 
have been reviewed. In general, many of the 
results of studies with the 1937 revision are 
pertinent to the newer revision. Such prob- 
lems as testing children from different ethnic 
and socioeconomic backgrounds than those of 
the children who served as the standardization 
sample for the 1937 edition remain. The 
validity of the instrument (apart from its 
correlations with other tests and the reliance 
upon the internal structure of the scale), a 
problem in the WISC that was emphasized 
by Littell (1960), is also a problem for 
the S-B. 

A comment on the initial collection of 
articles on the S-B may be in order at this 
point. Many of the studies can only be de- 
scribed as isolated and fragmented, based on 
correlations of the S-B and another test that 
happened to be in the patient’s files. Most of 
the studies reviewed were based on the re- 
tarded, with the gifted next in frequency as 
the test sample. Very few studies were con- 
cerned with the whole range of intelligence. 
Only the series by Kennedy and his co- 
workers (Kennedy & Lindner, 1964; Kennedy 
et al., 1960; Kennedy et al., 1963a, 1963b; 
Tiber & Kennedy, 1964) represents a serious 
effort to study large samples of children of 
the general school population, but even these 
studies were confined to Negro school children 
in southeastern states. 

On the basis of the reputation of the Binet 
scales, its past performance as a clinical in- 
strument, a standardization more carefully 
performed (for the 1937 revision, at least) 
than for almost any other individual test of 
intelligence, it can be expected that Form 
L-M of the S-B will continue in its role as 
leader. 

There is still, however, a need for carefully 
designed studies with large samples from 
heterogeneous populations. 
REPLY TO A “CRITIQUE AND REFORMULATION” 
OF BEHAVIOR THERAPY 


S. RACHMAN AND H. J. EYSENCK 
Institute of Psychiatry, University of London 


It is argued that Breger and McGaugh’s (1965) criticisms are misguided and 
that they fail to mention numerous studies and arguments which support the 
view that behavior therapy is an encouraging development and has already 
achieved some therapeutic success. Attention is drawn to various “laws of 
learning” which have been employed in constructing treatment techniques and 
for generating and assessing specific hypotheses. Several doubtful assertions 
made by Breger and McGaugh are discussed and factual errors are corrected. 
Their suggested reformulation of behavior therapy is rejected as being 
fragmentary, vague, and unconstructive. 


This reply to the recent paper by Breger 
and McGaugh (1965) will confine itself to a 
small number of crucial points; we will not 
discuss in detail, among others, two main con- 
tentions put forward by those authors. One of 
these is their “reformulation,” according to 
which learning conceptions of neurosis should 
make use of the “acquisition of strategies.” 
The suggestions made under this heading are 
so fragmentary, programmatic and elusive 
that we fail to see either their theoretical use- 
fulness or any practical consequences which 
might follow from them; when Breger and 
McGaugh have some actual applications to 
report, or have at least succeeded in showing 
how the major facts of neurotic behavior can 
be accounted for in terms of their scheme, 
then may be the appropriate time to take issue 
with their “reformulation.” The other contention 
relates to their preference for an “Expectancy 
× Value” type of theory, as compared 
to a “Drive × Habit” type of theory, to use 
Atkinson’s (1964) phrase. They are, of 
course, free to make any preference choice 
they like, even without repeating at some 
length arguments presented many times before; 
here too, however, one would require 
some more direct evidence indicating that 
Expectancy × Value theories give rise to 
different and more efficient methods of treatment 
than Drive × Habit theories before 
entering into any formal argument. As this 
issue is crucial to certain other assertions 
made by Breger and McGaugh, however, it 
will be referred to obliquely again below. 

The first criticism made by Breger and 
McGaugh is labelled “science issue”; they 
feel that there is no such thing as “modern 
learning theory,” that there is no agreement 
on sufficient points to make testable predic- 
tions and applications to the treatment of 
neurotics, and that behavior therapists are 
wrong in claiming that their procedures are 
based on scientific theories. Evaluation of this 
point may be aided by consideration of a 
quotation from Sir George Thomson, F.R.S. 
and Nobel-Laureate in physics. He points out 
that 


if differences of opinion . . . are still possible about 
space, time, and gravitation, this is an example of 


of view may lead to identical or nearly identical con- 
clusions when translated into what can be observed. 
It is the observations that are closest to reality. The 
more one abstracts from them the more exciting in- 
deed are the conclusions one draws and the more 
suggestive for further advances, but the less can one 
be certain that some widely different viewpoint 
would not do as well [1961, p. 15]. 


Much the same is true in psychology. Mac- 
Corquodale and Meehl (1954), Atkinson 
(1964), and many others have pointed out 
that Expectancy × Value and Drive × 
Habit theories overlap in many ways, and 
give rise to similar predictions, although ex- 
perimentalists may show a preference for one 
or the other of two ways of talking about 
phenomena. But both are agreed about most 
of these phenomena, and it is these which “are 
closest to reality,” and which form the factual, 
scientific basis of behavior therapy. No learn- 
ing theorist of any persuasion would deny 
statements of behavioral laws of this kind: 
“Reinforced pairings of CS and UCS under 
appropriate conditions produce conditioning”; 
“Intermittent reinforcement slows down ex- 
tinction”; “Nonreinforcement produces ex- 
tinction”; “Different schedules of reinforce- 
ment produce predictably different response 
rates.” It is laws of this type that are made 
use of by behavior therapists, who may choose 
to talk about them in the language of Hull, 
Tolman, Skinner, or any other major learning 
theorist. As an example, consider the work of 
Lovibond (1962) who made detailed predic- 
tions on the basis of the known facts of learn- 
ing theory for the behavior of enuretic pa- 
tients, and showed how in doing so he could 
(a) accelerate recovery and (b) reduce re- 
lapses; Young and Turner (1965) may fur- 
nish another example in the same disorder. 
Many others are given in Eysenck (1959, 
1964), Ullmann and Krasner (1965), Kras- 
ner and Ullmann (1965), Eysenck and Rach- 
man (1965), Rachman (1965a), and others. 
The application of scientific principles to any 
area must be specific, and must be discussed 
in terms of specific results; Breger and Mc- 
Gaugh’s failure to do so makes their ex 
cathedra condemnation meaningless. 

This lack of specificity, unfortunately, runs 
throughout their paper, 

On the critical side, their 

the literature on the subject? If Breger and 
McGaugh wish, in the other examples quoted, 
to indicate that behavior therapists actually 
speak to their patients and explain the rationale 
and nature of the treatment to them, 
then their point is taken even though it does 
lack novelty. Perhaps they are unaware that 
during the course of therapy, be it desensitization 
or any other method, the therapist also 
attempts to locate any sources of stress which 
may be provoking or maintaining the neurotic 
behavior. Where possible, these stresses are 
eliminated or at least ameliorated. The cases 
(of psychotic patients in these instances) described 
by Ayllon (1963) and Ayllon and 
Michael (1959) illustrate clearly how improvements 
can be obtained by breaking the 
links between stimulus and response patterns 
as they occur in the patient’s environment 
(Eysenck & Rachman, 1965). 

Breger and McGaugh’s paper is also self- 
contradictory. Immediately after deploring 
the emergence of a so-called dogmatic school 
of Behavior Therapy (“it is unfortunate that 
the techniques used by the Behavior Therapy 
group have so quickly become encapsulated in 
a dogmatic ‘school.’”) they proceed to dis- 
tinguish between the “three different posi- 
tions.” They also imply that behavior therapy 
is oversimplified (e.g. p. 346); in other parts 
of the paper, it is said to be cumbersome (p. 
348). Behavior therapists certainly pursue 
simplicity both in theory and in practice; 
this seems to us to be a desirable aim in itself 
and a welcome contrast to the convolutions 
of other psychotherapeutic theories. This con- 
trast is neatly, if inaccurately, demonstrated 
by Breger and McGaugh themselves. 


The behaviorist looks at a neurotic and sees specific 
symptoms and anxiety. The psychodynamicist looks 
at the same individual and sees a complex intra- and 
interpersonal mode of functioning which may or 
may not contain certain observable fears or certain 
behavioral symptoms such as compulsive motor acts. 
When the psychodynamicist describes a neurotic, his 
referent is a cohering component of the individual's 
functioning . . . etc. [p. 349]. 


The doubtful assertions contained in the 
paper by Breger and McGaugh are numerous 
and cannot be reproduced in full. The following 
examples could be multiplied without effort. 
“‘What is learned,’ then, is not a mechanical 
sequence of responses but rather, 
what needs to be done in order to achieve 
some final event [p. 342].” Is all learning 
really an attempt at achievement? Have 
neurotic patients presumably also “learned 
what needs to be done” in order to achieve a 
neurosis? A conditioned PGR is, likewise, a 
doubtful achievement. The list is endless, but 
in any event who decides “what needs to be 
done,” or what a “final event” is, or when it 
is achieved? The phrase “some final event” is 
hardly a model of precise definition. 

Another doubtful assertion is the statement 
that Harlow’s experiments with monkeys pro- 
vide a “much better animal analogue of hu- 
man neuroses than those typically cited as 
experimental neuroses [p. 356].” This cava- 
lier dismissal of the mass of work in the sub- 
ject of experimental neuroses (see Broadhurst, 
1960; Massermann, 1943; Wolpe, 1952; etc.) 
is neither explained nor justified by Breger 
and McGaugh. Their attitude to the evidence
seems to stem from a belief that “saying so, 
makes it so.” 

Their assertion that the “attribution of be- 
havior change to specific learning techniques 
is entirely unwarranted” is also misguided and 
appears to be based on ignorance of the rele- 
vant evidence. No mention is made of the 
experiments of Lazarus (1961), Wolpe 
(1952), Eysenck (1964), King, Armitage, & 
Tilton (1960), Lovibond (1962), or of the 
studies of Ayllon and his co-workers (1959,
1963). They will further be surprised by the
accumulation of recent studies which bear on
this point and which, with minor exceptions,
corroborate the viewpoint of behavior therapists
(see Eysenck, 1964; Eysenck & Rachman,
1965; Krasner & Ullmann, 1965; Rachman,
1965; Ullmann & Krasner, 1965, among
others). The currently available evidence will,
we feel certain, convince all but the most biased
workers that the methods of behavior
therapy are indeed effective in the modification
of neurotic behavior. Not all the methods
are successful; nor is it yet possible to
treat all types of disturbances successfully.
There is an immense amount of developmental
work and experimentation which remains
to be done, but a degree of optimism is
not misplaced.

Breger and McGaugh are surely correct in


drawing attention to the deficiencies of learn- 
ing theory; most of their criticisms, however, 


perceptual constancy, for example, have been


and Taylor (1962), and the restating of their 
complex arguments and experiments would be 
out of place. The concept of reinforcement 
is of course replete with complexities and 
seems to us to be best regarded in terms of 
Mowrer’s two-factor theory (1960). The dif- 
ficulties which arise from a consideration of 
central activities such as thinking were dis- 
cussed in an earlier review by Metzner (1961) 
—one which they appear to have 
again 2 years later (Metzner, 1964). 
Certainly, it would be exceedingly foolish 
to regard “learning theory” as a complete, 
coherent, and final account of human behavior.
This does not mean, however, that
people engaged in therapy should ignore
established findings and the best available 
theories. Quite the contrary. We feel that they 
are obliged to use these findings and ideas 
wherever it is feasible to do so. Furthermore, 


generis. 

Perhaps the most revealing reflection of the 
attitude of Breger and McGaugh to the en- 
tire subject of behavior modification is con- 
tained in their curiously unimaginative de- 
scription of Skinner’s work as “exercises in 
animal training.” Some notion of the wider 
significance of the pecking of pigeons can 
easily be ascertained from the work of Staats 
and Staats (1964) and Krasner and Ullmann 
(1965) among others.

Not merely doubtful, but definitely wrong, 
is the assertion that behavior therapists 


have partly avoided this problem [generality] by
focusing their attention on those neuroses that can
be described in terms of specific symptoms (bedwetting,
if this is a neurosis, tics, specific phobias,
etc.) and have tended to ignore those conditions
which do not fit their model, such as neurotic depression,
general unhappiness, obsessional disorders,
and the kinds of persistent interpersonal entanglements
that characterize so many neurotics [p. 348].


This is wrong factually in two respects. 
Firstly, a large number of patients with inter- 
personal anxiety and a moderate number of 
obsessional patients have in fact been treated 
(e.g., Lazarus, 1963; Wolpe, 1958). Secondly, 
Wolpe (1958) and most other therapists did 
not focus their attention on anything in par- 
ticular other than the symptoms presented by 
their patients, who were not selected or chosen 
by the therapists. Others, like Lovibond
(1962), Lang and Lazovik (1963), Yates
(1958), and the present writers (Eysenck & 
Rachman, 1965) have indeed experimented 
with specific symptoms, but not in order to 
avoid the theoretical problem of generality— 
the reason was simply that if specific predic- 
tions are to be tested, then responses must by 
preference be accurately measurable. It is
possible to count the rate at which tics occur, 
the number of wet nights per week, or the 
strength of a snake phobia; therefore, it is 
possible to experiment with the effect of 
changing various independent variables on 
these dependent variables. This choice there- 
fore permits the testing of quite precisely the 
sort of predictions which according to Breger 
and McGaugh cannot be made from learning 
theory principles; it would be interesting to 
hear their explanation of just how it is that 
verification has usually followed prediction! 
Finally, we turn to criticisms of “claims of 
success.” Breger and McGaugh state that “the
most striking thing about this large body of
studies is that they are almost all case studies.
A careful reading of the original sources reveals
that only one study (Lang & Lazovik,
1963) is a controlled experiment [p. 351].”
This is simply not an accurate statement of

the position as it obtained at the time of

Walsh (1961), Lovibond (1962), and others
and their horizon is clearly bounded, as they

the fact that theirs
be a comprehensive review
of the behavior-therapy literature.
Rather, it is based on a survey of all the
studies reported in the two reviews that have
appeared (Bandura, 1961; Grossberg, 1964).
This seems to us an inexcusable defect. Behavior
therapy may be said to have begun
properly around 1958-59, with the publication
of the Wolpe (1958) book and Eysenck’s
(1959) paper proposing the name “behavior
therapy” and stating in some detail its nature
and purpose. Given that controlled experiments
take several years to execute, write up,
and publish, it is clear why summaries of the
field published in 1961 or even 1964 would
not be adequate substantiation for such a far-reaching
condemnation of a whole branch
of study. Familiarity with Behaviour Research
and Therapy (Pergamon Press), a
journal concerned entirely with research in
behavior therapy and nowhere referred to by
Breger and McGaugh, would have served adequately
to bring them up to date in this field.
(It may be added that several controlled
trials of behavior therapy are in progress, to
our knowledge; three of them prospective and
one retrospective, Marks and Gelder, 1965, in
the Maudsley Hospital alone.) Even the Eysenck
and Rachman (1965) textbook, which
went to press 6 months earlier than the
Breger and McGaugh article, is very much
more up to date than their account (additional
evidence is discussed by Cooke, 1965;
Davison, 1965; Paul, 1964; Rachman,
1965a, 1965b).

We must say, indeed, that we feel quite
strongly that the burden of Breger and
McGaugh’s criticism is entirely misplaced.
In half a dozen years a relatively small number
of behavior therapists, with little official
support and often against the most hostile
opposition, have succeeded in carrying out
more controlled (and better controlled)
studies than have hundreds of psychiatrists
and psychoanalysts in 60 years, with all the
financial resources and the prestige so readily
available to them. Even so, we do not consider
our studies as in any way beyond criticism,
nor do we feel that they go nearly far
enough, or are sufficient to establish behavior
therapy as superior to other types of therapy
in any definitive way. We have concluded
in our textbook (Eysenck & Rachman, 1965)
that “the routine use of these methods is
undoubtedly not yet feasible; it must await
further improvement of techniques and definitive
evidence of superiority over other available
techniques [p. xii].” This is still our view,
and nothing said by Breger and McGaugh
would seem to contradict this summary or
throw doubt on its accuracy. To call views
of this kind “dogmatic” seems a curious misunderstanding
of the meaning of the word.


learning theory espoused by the behavior-therapy group. The claim that


As we pointed out in our analysis of cur- 
rent “learning theory” approaches to psycho- 
therapy and neurosis (Breger & McGaugh, 
1965), it is essential to distinguish clearly 
between learning theory, the practice of be- 
havior therapy, and the effectiveness of treat- 


sider them in the light of Rachman and Ey- 
senck’s (1966) comments. 


Learning Theory 


Ma LLY of Peripheral S-R theory to account 
or 


inadequacies of the
learning theory is countermanded


expectancy or a drive theory; the
“Drive × Habit” theory to which Rachman
and Eysenck refer, and that on which
behavior therapists continually rely, has been
unable to handle the facts of learning. Their
reference to “laws of learning” that “no learning
theorist of any persuasion would deny”
further avoids the issue. What Rachman


produce conditioning,” many would deny that
this finding is a “law of learning” that can be
generalized to situations which differ in significant
ways from those obtaining in a conditioning
experiment. The argument that these
phenomena “are closest to reality” and that
they “form the factual, scientific basis of
behavior therapy” is doubly misleading. The
phenomena (e.g., CS and UCS, extinction,
effects of different schedules, etc.) are close
to the “reality” of a highly artificial laboratory
situation, but only contact other “realities”
by what Chomsky (1959) has called
“analogic guesses formulated in terms of a
metaphoric extension of the technical vocabulary
of the laboratory [p. 30].” This is starkly
revealed when one sees how little the actual
content of behavior therapy resembles a
laboratory conditioning study. Behavior therapists
utilize a learning theory which has great
difficulty in dealing with behavior sequences
and complex transfer phenomena, yet it is
just such sequential behavior and complex
transfer effects that are manifested by their
patients. The neurotic presents a complex
symptomatic picture, not a discrete “neurotic
habit” that is analogous to a conditioned response.
Furthermore, the behavior therapists
claim rather widespread changes as a result of
their treatment methods (e.g., improved interpersonal
relations, increased productiveness),
changes that are extremely difficult, if not impossible,
to explain within their “learning
theory” framework.


Behavior-Therapy Techniques 


Although Rachman and Eysenck indicate 
that the techniques of behavior therapy were 
solely or very largely derived from learning- 
theory findings and ideas, it is clear that the 
techniques in question were in existence long
before Rachman and Eysenck’s learning
theory. Pfaundler described an apparatus for
treating enuresis in 1904 that greatly resembled
Mowrer’s conditioning technique, and
Nye, a pediatrician, outlined a proposed
method for treating enuresis in 1830 that included
all of the elements of “conditioning”
therapy (both cited in Lovibond, 1964). Circus
animal trainers used “operant” and “shaping”
techniques for centuries without the
benefit of “learning theory.” Thus, it is not
true that learning theory has been necessary
or even very important in developing the specific
techniques in question. Rachman and
Eysenck are also incorrect when they imply
that the observable effects produced by some
training technique (such as operant conditioning
or desensitization) are support for the
learning theory from which these techniques
were “derived” in the sense that some very
specific experiment supports the theory from
which it was deduced. The prior existence of
the techniques as well as the great dissimilarity
between what goes on in behavior therapy
and most learning experiments indicates
that the relationship between theory and technique
is nonspecific.

In our previous paper, it was pointed
out that many different activities go on
in behavior therapy (few of them resembling
conditioning and many of them resembling
traditional psychotherapy), thus


confounding any attempt to attribute effects 
to behavior therapy per se. Rachman and Ey- 




Effectiveness of Treatment 


In addition to their failure to control what 
goes on in behavior therapy, we originally 


inal criticisms are 
(Many of these references are not new, of 
course, since they consist of volumes of anthologised
reprints.) Let us examine some of
the sources cited by Rachman and Eysenck,
beginning with Eysenck (1964).
This volume presents 42 separate articles; 
19 are behavior-therapy case studies (9 of 


1 It is not true, however, that Behaviour Research
and Therapy was “. . . nowhere referred
to” by us. See Breger and McGaugh’s (1965)
reference to Lazarus (1963).

these report only a single case); 13 are case
studies using operant techniques; and 2 describe
other methods. All 34 of these case
studies are fully subject to the sort of biases
we pointed out previously. Six articles are
attempts at theoretical treatment, and two, 
and only two, are studies with some sort of 
control. These are Lang and Lazovik (1963), 
originally cited in our review as the only ex- 
ample of a behavior-therapy study with some 
concern for control, and an article by Anker 
which re-reports the study (cited separately 
by Rachman and Eysenck) by Anker and 
Walsh (1961). Lang and Lazovik found that 
“desensitization is very effective in reducing 
the intense fear of snakes held by normal 
subjects, though it can be questioned whether 
this is a phobia in the clinical sense.” Anker 
and Walsh demonstrated the superiority of 
activity groups (activity consisted of plan- 
ning and putting on a play for other patients) 
over therapy groups in improving the ward 
behavior of schizophrenics. The relation of 
this study to “learning theory” eludes us, nor 
do Anker and Walsh make any mention of it.
The study is a good example of the kind of 
innovative approach to patient management 
that is closely allied to the work of Fair- 
weather (1964) to which we referred in our 
original paper. Thus, this recent volume ed- 
ited by Eysenck provides no additional un- 
biased data in support of behavior therapy. 
Four other sources are specifically referred 
to by Rachman and Eysenck in response to
our criticism of lack of control in behavior 
therapy case studies: Cooper (1963) which, 
unfortunately, was not available to us, Laz- 
arus (1961), Ellis (1964), and Lovibond 
(1964). (Both Lazarus and Ellis also appear 
in Eysenck, 1964.) Lazarus reports that group 
desensitization is superior to interpretive 
group therapy in reducing symptoms. He was
the therapist for both groups and also assessed
the results. Need more be said concerning
possible experimenter bias and lack of control?
The article by Ellis contains a theoretical
discussion of “rational psychotherapy,”
two uncontrolled case studies, and a report in
which he compares his own notes on cases he
has treated with various methods (psychoanalysis,
psychoanalytic psychotherapy, and
rational psychotherapy). The report shows
that Ellis’ patients, as assessed by him, do
better when he administers his own brand of
“rational psychotherapy.” This report is fully
subject to all three sorts of bias and, so far as
we can tell, has little to do with any learning
theory.

Finally, there is the book by Lovibond
(1964) to which Rachman and Eysenck refer
several times. This is a very interesting volume
which nicely combines historical review,
theoretical discussion, and a comparative
evaluation of several methods for treating
enuresis. Lovibond presents a rather convincing
case that: (a) enuresis is not a neurosis
(he presents evidence from several sources and
concludes, “The enuretic population differs
little from the general population of children
in terms of psychological adjustment.”); and
(b) that the “conditioning” techniques utilizing
immediate awakening are effective in
bringing enuresis under control. His general 
discussion makes it clear that a simple condi- 
tioning model is inadequate in conceptualizing 
this condition, as exemplified by his references 
to: “consolidation of the trace [p. 133],” 
“central decision process in voluntary mictu- 
rition [p. 149],” and the like. The following 
quote is taken from his concluding section: 


From this point of view an adequate general
theory of behavior must be a centralist theory; one
which gives appropriate emphasis to central integrating
and regulatory processes. From their neurophysiological
aspect, these processes may be regarded
as “autonomous” cerebral processes (
1949), and from their psychological aspect they may
be regarded as the processes of consciousness [p.


The detailed consideration of these refer- 
ences should make it clear that Rachman and 
Eysenck are as careless and misinformed
about the references they cite as they are
about the important issues in the field of
learning. In sum, the case for behavior therapy
appears as weak as before. The “new”
references are as subject to bias as those previously
cited, and the theoretical treatment of
issues in the field of learning remains naive
and misleading.


COMPUTER SIMULATION AND PSYCHOLOGICAL
THEORIES OF PERCEPTION 1


Computer simulations of perceptual processes have often not related directly
to questions of concern to the psychology of perception and, in particular, 
have regarded perception as a sensory, as opposed to a sensorimotor or active, 
process. Some of the psychological literature which is relevant to the issue of 
perception as a passive vs. an active process is reviewed and the differences 
between these alternative conceptions of perception and the gains to be derived 
from using the active-perceiver model are spelled out. Past computer models 
are reviewed in the light of such psychological theories of perception. A dif- 
ferent simulation program based explicitly on the active-perceiver model of 
perception is then sketched in broad outlines and its potential for doing 
research upon psychological problems is reviewed. 


The present paper discusses theoretical and 
methodological issues which become relevant 
if the computer is to be used for the explicit 
investigation of psychological theories or if 
psychological theory is to contribute to work 
in artificial intelligence. In particular, percep- 
tion has been the domain of much computer 
research over the past 10 years or so. (For a 
relatively recent review of this work see Uhr, 
1963.) 

Current computer models of perceptual 
processes will be discussed and contrasted 
with certain theories of perception. An ex- 
ample will be introduced of a computer model 
of perception which may have advantages over 
current programs in that it is more general 
and has greater potential for simulating im- 
portant and interesting phenomena of percep- 
tion, such as perceptual learning, perceptual 
organization, attention, and selection. In simulating
such processes, the computer model becomes
a means for testing a theory or class of
theories about such phenomena.


PSYCHOLOGICAL BASIS OF CURRENT COMPUTER
WORK IN PATTERN PERCEPTION


In addition to contributing elaborate tech- 
niques, computer work in the area of percep- 
tual processes has shown how a simulated


1 This paper is an outgrowth of work in progress
under National Institutes of Health grants M6155
and PO 1 HDO 1368-01.

2 The authors wish to thank A. C. Cafagna for
some of the ideas expressed in this paper.

organism can go about absorbing invariants
from its environment. In so doing, this type 
of research has in effect generated artificial 
organisms, each one of which can often ad- 
dress itself to a wide variety of discrimination 
tasks. Much of this work, however, has not 
been related explicitly and in detail to current
psychological theories of perception or
has from the outset omitted what seem to be 
crucially important variables, with the result 
that important psychological problems have 
not found their way into research using com- 
puter simulation as a tool. Specifically, what 
has been left out in computer studies on perception
are the motor and sensorimotor processes
which accompany perception, and much
of this paper is devoted to a theoretical dis- 
cussion intended to demonstrate the relevance 
of such processes to perception. 

Past computer work of relevance to the stu- 
dent of perception is that which has gone 
variously under the name of “pattern recog- 
nition,” “pattern perception,” and the like. 
Much of this computer research, seen from 
the point of view of a behavior theorist, might 
be taken to embody a sensory as opposed to a
sensorimotor theory of perception. The input
to the “retina” in these studies is processed
via operations or quasi-neurophysiological
processes. Operations vary in complexity from
study to study, and in the most elaborate of
studies (Finley, 1963; Rochester, Holland,
Haibt, & Duda, 1956; Rosenblatt, 1958; Uhr
& Vossler, 1963) they resemble some of the
neurological processes of visual perception
which have been hypothesized or reported by
neurophysiologists and psychologists studying
certain animals (Hebb, 1949; Hubel & Wiesel,
1963; Lettvin, Maturana, McCulloch, & Pitts,
1959). Rather than being conceived as an
active process, perception is taken to be the 
processing of passively received inputs from 
the environment. 

Two broad senses in which an organism 
may be said to be an “active perceiver” or 
perception an “active process” may be dis- 
tinguished. In the first sense, the activity in- 
volved is that of exploration—search for stim- 
ulation and variation in sensory input. The 
contrast is with random inputs which occur 
simply as a result of a given accidental orien- 
tation of the sense organs, or with inputs 
provided by some outside source such as an
experimenter. Activity in this sense will involve,
to a large extent, bodily movement (including
eye movements), but may also be
taken to include such acts of attention as do
not require overt movements. Furthermore,
activity in this sense can involve proprioception
as well as the possibility for interaction
between sensory and motor events.

In the second sense, “active perception”
refers to the way in which the sensory inputs,
however received, are processed by the organism.
An active perceiver in this sense is an
organism whose perception is dependent on
some special contribution, made by the organism’s
past experience, neurophysiological
structure, etc., which transforms the raw
sensory data. The contrast here is with “empty
organism” or “tabula rasa” theories, which, in
their classical extreme form, regarded the
mind as essentially a structureless medium for
receiving sense impressions from the outside
world.

In the following, perceiver theories will be
referred to as active only if they are active in
both senses. As a consequence of the desire to
study perception in its sensory and motor
aspects, a phenomenon such as attention will,
for the time being, not be considered if it does
not involve overt motor movements. A brief
description of some of the simulations will
now be given in order to illustrate the points
made above. This will be followed by a critique
of these studies.

Take, for example, a study by Uhr and
Vossler (1963). This study concerns a pat- 
tern-recognition program which generates its 
own operators for processing a very extensive 
set of inputs. It is, in general, an example of a 
very powerful program, as pattern-recogniz- 
ing programs go. 

The input environment of the above-men- 
tioned program consists of all patterns that 
can be represented by 0’s and 1’s on a 20 × 20
matrix. The program then “filters” a given
pattern with a collection of operators which
are arbitrary patterns of 0’s and 1’s on a
smaller (5 × 5) matrix. This filtering process
is achieved by moving each operator-matrix 
systematically across the input matrix and 
computing: (a) the number of matches with 
the input pattern, (b) the average horizontal 
and vertical position of the matches, (c) the 
average value of the square of the radial dis- 
tance from the central square of the operator- 
matrix of the matches. Consequently, each op- 
erator maps an arbitrary input pattern into 
four numbers (between 0 and 7), representing 
the above four attributes. 
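
The filtering step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reconstruction of the Uhr and Vossler program: the function name `filter_pattern` is hypothetical, a "match" is assumed to be a placement at which every cell of the operator equals the input cell beneath it, the radial distance is assumed to be measured from the centre of the input matrix to the central square of the operator at each match, and the rescaling of each attribute to the range 0 to 7 is omitted.

```python
def filter_pattern(pattern, operator):
    """Slide `operator` over `pattern` (lists of lists of 0/1).

    Returns (match_count, mean_col, mean_row, mean_squared_radius).
    Assumption: a match is an exact cell-by-cell fit of the operator.
    """
    n_rows, n_cols = len(pattern), len(pattern[0])
    k_rows, k_cols = len(operator), len(operator[0])
    # Assumed reference point for the radial distance: centre of the input.
    centre_r, centre_c = (n_rows - 1) / 2, (n_cols - 1) / 2

    hits = []  # (row, col) of the operator's central square at each match
    for r in range(n_rows - k_rows + 1):
        for c in range(n_cols - k_cols + 1):
            if all(pattern[r + i][c + j] == operator[i][j]
                   for i in range(k_rows) for j in range(k_cols)):
                hits.append((r + k_rows // 2, c + k_cols // 2))

    if not hits:
        return (0, None, None, None)
    count = len(hits)
    mean_row = sum(h[0] for h in hits) / count
    mean_col = sum(h[1] for h in hits) / count
    mean_r2 = sum((h[0] - centre_r) ** 2 + (h[1] - centre_c) ** 2
                  for h in hits) / count
    return (count, mean_col, mean_row, mean_r2)


# A 20 x 20 input containing one solid 5 x 5 block of 1's.
pattern = [[0] * 20 for _ in range(20)]
for row in range(3, 8):
    for col in range(4, 9):
        pattern[row][col] = 1

operator = [[1] * 5 for _ in range(5)]
result = filter_pattern(pattern, operator)
print(result)  # (1, 6.0, 5.0, 32.5)
```

The horizontal and vertical averages are returned separately, which is how the three computations listed in the text yield four attributes per operator.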

Many operators can be formed and a varia- 
ble number used in any machine run. From a 
collection of operators, generally those are 
chosen which score highly over examples of 
the same pattern (say, different patterns of 
handwritten A’s), yet vary between patterns. 
Operators can also be compounded to form 
“higher level” operators. Furthermore, lists 
of operators may be constructed, some of 
which will be relevant to the discrimination of 
A’s, others of B’s, etc. Matrices of names of 
inputs against operators may be constructed. 
A name gets attached to that string of opera- 
tors which produces the highest score for that 
name. If the string of operators and com- 
pound operators attached, say, to the name A 
identifies a new input as A when, in fact, it is 
not an A, the operators which primarily 
matched the new input are downgraded. Thus 
a continuous weeding out of operators is 
achieved as experience with new inputs ac- 
crues. New operators may then be constructed 
as desired. In short, the model resembles in an
analogical sense some of the elementary and
higher-order processes which are known to
take place in the visual system from the retina
to the visual cortex (viz. Hubel, 1963, for a

generate their own integrations
of visual inputs or which operate on
visual inputs are Grimsdale, Sumner, Tunis, 
and Kilburn (1959), Selfridge and Neisser 
(1960), etc. (For a more complete listing see 
Uhr, 1963.) 
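
The weeding-out of operators described above for the Uhr and Vossler program can be illustrated with a small sketch. Everything here is an assumption made for illustration: the names `classify` and `train_step`, the numeric weights, the fixed penalty, and the rule that an operator is dropped once its weight reaches zero are hypothetical and are not taken from the original program. The sketch shows only the general idea that operators responsible for a wrong identification are downgraded.

```python
def classify(fired, weights):
    """Return the name whose operators that fired carry the largest total weight.

    `fired` is the set of operator labels that matched the input;
    `weights` maps each name to a dict {operator label: weight}.
    """
    scores = {name: sum(w for op, w in ops.items() if op in fired)
              for name, ops in weights.items()}
    return max(scores, key=scores.get)


def train_step(fired, true_name, weights, penalty=1.0):
    """Classify one input; on an error, downgrade the operators that caused it."""
    guess = classify(fired, weights)
    if guess != true_name:
        for op in list(weights[guess]):
            if op in fired:
                weights[guess][op] -= penalty
                if weights[guess][op] <= 0:
                    del weights[guess][op]  # operator is weeded out
    return guess


# Hypothetical operator lists attached to the names "A" and "B".
weights = {"A": {"op1": 1.0, "op2": 2.0}, "B": {"op3": 1.5}}
fired = {"op1", "op3"}           # operators that matched a new input

first = train_step(fired, "A", weights)   # misidentified as "B", so op3 is downgraded
second = classify(fired, weights)         # the same input is now identified as "A"
print(first, second, weights["B"])        # B A {'op3': 0.5}
```

A fuller version would also reward operators after correct identifications and generate replacements for discarded ones, as the text indicates the program could do.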

An example of an approach still more ex- 
plicitly oriented toward a biologically and 
psychologically based model of perception is 
the work of Rosenblatt (1958). Rosenblatt’s 
program claims to be more or less explicitly 
based on the work of Hebb (1949). Thus, it 
starts out with a fairly randomly arranged 
“neural net” and develops a theory of statisti- 
cal separability whereby specific neural path- 
ways become established, each identified with 
a particular perceptual response or experience.
The system includes sensory points (S-points),
association units (A-units), and response units
(R-units). There exist localized (systematic)
connections between A-units and S-points and
random connections between A-units and R-units.
Between A-units and R-units a two-way
feedback system exists such that the activity 
of an A-unit can in turn activate an R-unit 
and vice versa. F connections be- 
tween different units can be of an excitatory 
as well as an inhibitory nature. As in the 


Other simulation programs which are similarly
based on assumptions about the formation
of cell assemblies in the brain as a function
of the organism's interaction with an environment
are those by Finley (1963) and
Rochester et al. (1956). 
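
The arrangement of S-points, A-units, and R-units attributed to Rosenblatt above can be made concrete with a minimal sketch. It is a simplification under explicit assumptions and not Rosenblatt's model: the functions `build_net` and `respond`, the one-dimensional "retina," the receptive-field size, the threshold, and the fan-in are all hypothetical, the two-way feedback between A-units and R-units is left out, and no learning rule is included. It shows only the wiring scheme the text names, that is, localized excitatory or inhibitory connections from S-points to A-units and random connections from A-units to R-units.

```python
import random


def build_net(n_s, n_a, n_r, field=3, fan_in=4, seed=0):
    """Wire a toy net: localized S->A connections, random A->R connections."""
    rng = random.Random(seed)
    s_to_a = []
    for a in range(n_a):
        # Each A-unit sees `field` adjacent S-points (a localized connection),
        # each with an excitatory (+1) or inhibitory (-1) sign.
        start = (a * (n_s - field)) // max(n_a - 1, 1)
        s_to_a.append([(start + k, rng.choice((1, -1))) for k in range(field)])
    # Each R-unit listens to a random subset of A-units.
    a_to_r = [rng.sample(range(n_a), fan_in) for _ in range(n_r)]
    return s_to_a, a_to_r


def respond(stimulus, net, threshold=1):
    """Return, for each R-unit, how many of its A-units are active."""
    s_to_a, a_to_r = net
    a_active = [sum(sign * stimulus[s] for s, sign in conns) >= threshold
                for conns in s_to_a]
    return [sum(a_active[a] for a in sources) for sources in a_to_r]


net = build_net(n_s=10, n_a=6, n_r=2)
blank = respond([0] * 10, net)   # no stimulation, so no A-unit reaches threshold
print(blank)                     # [0, 0]
```

Note that in this sketch, as in the simulations discussed, the stimulus is simply handed to the net; nothing the net does changes what it receives next.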

The points to be made are that in these 
computer models perception is viewed as a 
passive rather than an active process. They 
disregard the active nature of perception 
which has been stressed in one form or another
in some of the literature on perception
(Gaarder, 1963; Gibson, 1963; Hebb, 1949;
Hein & Held, 1962; Hochberg, 1964; von
Holst, 1954; Johansson, 1950; Kohler, 1964). 
In the simulations of perception by computer, 
the inputs into the system are generally not 
affected or controlled by an immediately preceding
response, such as eye or head rotation,
locomotion, etc. Rather, a given stimulus is a
momentary object projected on a stationary 
retina by an experimenter. Moreover, abstract 
features of input which are classified as in- 
variant are usually not explicitly those fea- 
tures of input which remain invariant under 
the transformations of input which are pro- 
duced by an active perceiver who moves his 
eyes and head or who locomotes through space. 
An exception to this particular point is Rosen- 
blatt’s (1958) work in which a series of 
stimuli presented by the experimenter repre- 
sented geometric transforms of a stimulus 
which an active perceiver might have caused 
to occur as a result of his own eye, head, or
locomotion movements. No explicit attention 
is devoted to the fact that the visual sense 
organ is capable of motor adjustments and 
that the motoric in fact helps to define and 
delimit the encounters which the organism 
may have with the outside environment. It 
has been pointed out by Hebb (1949), for 
example, that under the translations of an
input pattern which result when an observer
looks at a pattern while moving his eyes, his
head, or while locomoting through space, certain
parts of the pattern nevertheless remain
invariant and will evoke invariant retinal activity.
These invariant parts of a pattern,
suggests Hebb, are straight lines, individual
corners, etc. which are being focused on. It
being the case that perception takes place
while inputs to the retina change continuously
due to motion of the observer, Hebb concludes
that by considering motion, one could deduce
something about the perceptual elements in
terms of which such an organism might perceive
its environment.
fhe passive approach to perception, since it 
concerned with the contributions the 
i m itself makes to perception, assumes 
at rules or modes of organization of inputs 
somehow given in the environment rather 
partially affected by the nature of the 
m’s interaction with the environment, 
blatt (1958) seems to hint at this point 
hen he says in his theoretical discussion of 
eralization in perception that “similarity is 
[a necessary attribute of particular formal 
ometric classes of stimuli, but depends 
e physical organization of the perceiving 
n, an organization which evolves through 
with a given environment.” How- 
f, for Rosenblatt, a response is not some 
vity which searches out new input or which 
i be stimulated directly from “sensory 
is.” Rather, a response is an activity 
‘can be stimulated only through the 
lation area” cells. It is, therefore, en- 
ly internal to the system or, if not, it does 
beffect new sensory input to the eye. The 
m, in short, does not consider the con- 
i made by the motoric in procuring 
“perceptual input for the organism. As a 
Rosenblatt fails to see that there is 
respect in which attributes of stimuli, 
their straightness, angularity, good 
symmetry, etc., are not given in the 
fonment: exploratory motor activity may 
Mired as well as internal structure. That 
this may be the case is suggested in the work
of Bower (1965), Hebb (1949), Held &
Bossom (1961), Kohler (1964), and Platt
(1962). Finally, having disregarded the mo-
toric as a system with properties of its own
such as the ability to act independently in
transforming visual inputs, the above com-
puter studies are also not concerned with the
development of a capacity for associating
motor movements to particular sensory
inputs and vice versa, that is, with the problem
of sensorimotor interaction in perception.
A discussion will now be presented of some
psychological evidence which has al-
ready been alluded to and which can be
used for the thesis that perception, or at
least perceptual development, ought to be
formulated as a sensorimotor or active process
if perceptual development, perceptual organi-
zation, and perceptual attention and selec-
tion are to be studied.


RELEVANT ISSUES IN THE PSYCHOLOGICAL
THEORY OF PERCEPTION


Models which regard the perceiver as active 
have been advanced in the literature in order 
to account for empirical findings and have also 
been found to provide interesting approaches 
toward the solution of baffling phenomena, 
such as the well-known constancy phenomena, 
that is, the perception of invariance under 
changes of input to the retina. Gibson (1961), 
for example, equates perception, not with 
events happening in the retina at a given 
moment in time, but with events happening in 
the visual system at a given time and over 
time. By the visual system is meant the visual 
apparatus as a whole, which consists, among 
other things, of an eye that is not stationary 
but moving and which causes a continuously 
changing input to the retina of the eye. A di- 
rect quote from Gibson (1963) follows: 


We had to suppose that the role of the senses, their 
sole function, was not to yield sensations. Instead of 
mere receptors, i.e., receivers and transducers of
energy, they appear to be systems for exploring, 
selecting and searching ambient energy. . . . This 
new picture of the senses includes attention as part 
of sensitivity, not as an act of the mind upon the 
deliverances of the senses [p. 12]. 


Gibson suggests that such things as the ob-
ject, motion, perception of patterns,
and the like are invariants or higher-order
variables discovered by a perceiver who is 
moving his eyes, turning his head, locomoting 
through physical space, etc., and whose mo- 
tion-produced sensory inputs change from 
moment to moment as a result of such motion. 
Gibson (1957, 1954) and Gibson, Purdy, and 
Lawrence (1955) have demonstrated in a 
series of experiments how different types of 
transformations in an input to the eye over 
time—whether substantial or artificially pro- 
duced—cause the subject to see and distin- 
guish between and among rigid, elastic, and 
multiple moving things or to perceive rigid, 
nonmoving objects in space. Gibson thus pro- 
vides the notion that the perceptual invariants 
which are relevant to a living organism are the
invariants contained in the transformations 
which a living organism performs onto the 
optic array by moving its eyes, head, etc. 
Along similar lines, Bower (1965) has shown 
that motion parallax (i.e., relative displace- 
ment produced in near vs. far objects in a 
visual field under head motion) is necessary 
for the discrimination of angularly equivalent 
but objectively different sizes of objects (size 
constancy). 

Gibson, although he has made much of 
motor exploration, has not emphasized the 
role of kinesthetic and proprioceptive feed- 
backs in perception. Unlike, for example, 
Kohler (1964), whose work will be reported 
in detail later, Gibson deals generally with 
perceptually mature subjects rather than with 
perceptual learning. Gibson could therefore 
remain relatively unconcerned about any spe- 
cial sensorimotor connections which have been 
shown to be forming during perceptual learn- 
ing; for example, in experiments in which the 
normal perceptual world is altered by pris- 
matic spectacles (Kohler, 1964). Gibson 
could take oculomotor integration for granted 
and did not have to be concerned about the 
special effects which Kohler shows are due to 
so-called situational, that is, behavioral, fac-
tors.

In the following, a series of experiments 
will be considered which suggest that sensori- 
motor and motor-sensory processes are heavily 
implicated in perception.

Hebb (1949) presents a theory of percep- 
tion in which, like Gibson, he elaborates fea- 
tures of the environment which might remain 
invariant under the motions of an active 
perceiver. However, unlike Gibson, he also 
suggests that there is a feedback between sen-
sory inputs and motor behavior in perception.
He points out that under specific motions of
the organism, such as the eye scanning in a 
plane, locomotion in one given direction, etc., 
certain inputs like individual straight lines, 
individual angles, etc. remain invariant and 
can become elements of perception. Or, again,
he points out that if the stimulus is a tri-
angle, say, specific sensory inputs, like the in- 
put “angle,” are invariably followed by motor 
movements, like eye scans along one of two 
sides of the triangle. What is proposed is a 
mapping between elementary motor move-
ments and elementary sensory invariants dur-
ing a relatively enduring length of time. This
simultaneous arousal of specific sensory and 
motor events leads to mutual facilitation be- 
tween sensory and motor processes in Hebb’s 
system. Certain sensory events can facilitate 
certain motor events, and motor events can 
facilitate certain sensory events. Thus Hebb’s 
theory can account for and emphasize the 
importance of phenomena of selection as well 
as attention in perception: It follows from 
Hebb’s theory that a given perceptual event, 
such as a given momentary discrimination of 
a visual input, is ordinarily not explainable in 
terms of the input alone, nor in terms of the 
state of the visual cortex at the time, but re- 
quires in addition a specification of the state 
of the motoric.

Sperry (1952), arguing along lines of sen- 
sorimotor functioning as an integrated proc- 
ess, states: 


Our present one-sided preoccupation with the sen- 
sory avenues to the study of mental processes will 
need to be supplemented by increased attention to 
the motor patterns, and especially to what can be 
inferred from these regarding the nature of associa-
tive and sensory functions. Instead of regarding
motor activity as being subsidiary, that is, something
to carry out, serve, and satisfy the demands of the
higher centers, we reverse this tendency and look
upon the mental activity as only a means to an end
where the end is better regulation of overt response.
. . . Any separation of mental and motor processes
in the brain would seem to be arbitrary and indefi-
nite [p. 299].


Evidence for this point of view is supplied 
by Sperry (1958). The research in point deals
with the so-called “split-brain” preparation. 
Cats are used whose brain has been split down 
the middle by section of the corpus callosum, 
the optic chiasma, and usually the anterior 
and hippocampal commissures. In these split- 
brain cats, one whole hemisphere is left intact 
in order to maintain generalized background 
functions and to prevent incapacitating paral- 
ysis. Cortical lesions can then be applied to 
specific learning and memory functions within 
the other single hemisphere, leaving intact 
only the critical sections one wishes. It devel- 
ops that when an island of cortex, including 
the area for central vision, is isolated, the rest 
of the cortex having been eliminated, nearly
all previously-trained visual discriminations
with the eye on the affected side are lost. 
Thus, discriminations between a cross and a 
circle and between perfect and imperfect tri- 
angles are lost, though they can be relearned 
to a very limited extent. Discriminations be- 
tween upright and inverted V’s are lost and 
cannot be relearned. The simplest discrimina- 
tions between horizontal and vertical stripes 
survive, or if lost, can be retrained. If the 
removal of the same area of cortex is carried 
out in steps, and the first step consists of the 
removal of the temporal lobe and all areas ad- 
jacent to the visual island, sparing only the 
frontal region with somatic sensory and motor 
areas, the animals retain most of their pre- 
operatively-trained discriminations at a fairly 
high level, including that for upright and 
inverted V’s. Subsequent removal of somatic 
sensory and motor areas produces an addi- 
tional marked lowering of visual performance 
comparable to the level obtained by the total 
removal reported earlier. Sperry states:


It would thus appear that the somatic areas of the 
cortex play some important role in visual discrimina- 
tion... . Possibly the greater functional efficiency 
of the isolated somatic cortex, as compared with that 
of the isolated visual cortex, can be attributed to the 
inclusion of the motor areas within the intact som- 
esthetic island [pp. 419-420]. 


In any case, the question with which the
author started the experiment, namely whether
“. . . perceptual learning and memory can be
mediated by sensory cortical area alone or
whether it is dependent on more complex inte-
grations involving the function of other cor-
tical areas,” seems to have been answered in
favor of the second alternative.

Another series of studies which demonstrate 
this sensorimotor nature of perception in the 
living organism are those carried out by Ivo 
Kohler and his associates at Innsbruck 
(1964). Kohler’s experimental findings se-
verely question the applicability of the pas-
sive-perceiver model. They suggest that in
order to explain many forms of perceptual
phenomena the total situation involved in the
perceptual experience needs to be taken into
account, including head and eye movements.
Kohler observed the process of perceptual
learning by putting prisms of various kinds
on adult subjects for prolonged periods of
time, thereby disorganizing their visual world,
and then studied their adaptation as well as 
the aftereffects which ensued when they no 
longer wore the prisms. He carried out a series 
of studies with spectacles the upper half of 
which was a prism and the lower half of which 
was ordinary glass. The upper half of the 
spectacles, therefore, caused various distor- 
tions known to be caused by prisms: lateral 
contraction and expansion; slanted horizontal 
and vertical contours; distortion of right an- 
gles as they approached the periphery of the 
visual field (parallel lines no longer appeared 
parallel). Any kind of motion by the observer 
in this prism-induced environment causes a 
perceived world which is exceedingly “rub- 
bery.” Moving the eyes from top to bottom 
behind the glasses, in addition, causes a dis- 
placement at the midline. 

If subjects were to adapt to these glasses 
and see the world relatively veridically, it 
meant that the transformations to which vis- 
ual inputs were subjected by an active per- 
ceiving system had to be materially different, 
depending upon whether the eye or head 
moved up or down. If, further, aftereffects 
which resulted from this condition varied 
with such motions, then the importance of 
the contribution of the motoric to perception 
could be demonstrated. After subjects had 
generally adapted to these spectacles (i.e., 
after wearing them for 50 days or more) and 
perceived objects more or less veridically, 
without a shift at the midline or apparent 
movements caused by head or eye motion 
(except, say, if performed abruptly), they 
removed their spectacles. Kohler then ob- 
served in himself as well as in others distinct 
aftereffects similar to, though the inverse of, 
the effects first experienced upon wearing the 
spectacles. Thus, in walking past a brick wall, 
for example, the subject, when moving his 
eyes up, would see the wall as being in ap- 
parent motion as well as being curved; where- 
as when he moved his eyes downward, the 
same wall would look normal. The change-
over, moreover, occurred about at the midline
where the original spectacles had divided 
themselves. Thus the subject acted as if he 
were still wearing spectacles. 

From this and many other findings, Kohler 
concluded that perception is a process in which 
invariants are noted between retinal and other
processes in the visual pathway, on the one 
hand, and the motor processes employed by 
the active perceiver in attaining perceptual 
inputs on the other. Again, the similarity of 
this conclusion to Hebb’s (1949) is evident. 
Only under conditions in which perceptual 
inputs are invariant over large classes of mo- 
tor movements (as in the experiments in 
which full prisms were used) do the striking 
findings of Kohler’s, in which perceptual im- 
pressions changed by mere changes in eye 
movement, disappear. Thus, the aftereffects 
induced by full prisms are to make many con- 
tours seem curved, regardless of whether the 
eye or head moves up or down, left or right, 
and under these conditions the particular af- 
tereffect recorded above disappears. 

There are various other studies in the lit- 
erature which provide elaborations of Sperry’s 
and Hebb’s thought, and evidence which ex- 
pands the conception of the active-perceiver 
model still further. Von Holst (1954), like 
Sperry, is concerned with sensorimotor events 
and their integration. He postulates for the 
active perceiving organism efferent impulses 
and two sets of afferent impulses: exafference 
and reafference. Exafference refers to those 
impulses resulting from external stimuli (ex- 
clusive of such impulses which may result 
from bodily movement), and reafference refers 
to those impulses which arise because of move- 
ments of the organism. Consider the follow- 
ing experiments by von Holst and Mittel-
städt (1950).

If a fly was put in the center of a black and 
white striated cylinder which was rotating to 
the right, the fly followed its movement by 
also rotating to the right. If the cylinder was 
rotated to the left, tracking took place to the 
left. If, however, the head of the fly was ro- 
tated 180° in relation to the body, so that its 
left eye was on the right and vice versa, the 
visual signals were reversed, and the fly turned 
opposite to the direction of the rotation of the 
cylinder. Von Holst, therefore, proposes that
the fly’s eye contains specific neurons that
control tracking movements. In a second ex-
periment involving a normal fly, a stimulus 
such as smell was placed to the left of the fly. 
The fly now moved to the left while the cyl- 
inder remained stationary, and it produced 
the same visual afferent inputs into its eye as
in the first experiment when the cylinder 
moved to the right and the fly was stationary 
—the only difference being that these inputs 
were now motion-produced. Under these con- 
ditions, the fly did not show the oculomotor 
reflex which it showed in Experiment 1 for 
that same input. That is, the same afferent 
input no longer signified a motion to the right 
by the environment and did not elicit a track- 
ing motion in that direction. The fly merely 
initiated left movement and stopped when it
reached the source of the smell. This second ex-
periment attests to the fact that efferent in- 
formation is somehow processed by the sys- 
tem and that it contributes to the interpreta- 
tion which is placed on afferent input. Re- 
ported finally is a third experiment identical 
to the second except that the fly’s head was 
rotated 180°. In this condition, a left move- 
ment by the animal which was induced by the 
stimulus of smell produced afferent visual 
input normally associated with movement to 
the right. In this case, it is found that the fly 
continued to move to the left until exhausted. 
The afferent visual input, which is opposite to 
that in Experiment 2 and which is not nor- 
mally associated with the given efferent infor- 
mation, now apparently carried information 
to the animal which excited further action. 
Experiment 3, again, attests to the importance 
of interaction between afference and efference. 

Von Holst (1954) explains these results by 
postulating a built-in summation or compari-
son between monitored efferent and afferent
signals. According to von Holst, if the nor- 
mal animal moves to the left in a stationary 
environment, an efferent copy of left move- 
ment as well as the reafferent signals associ- 
ated with such a movement (i.e., a movement 
of the visual field from left to right across the 
retina) are compared in the central nervous 
system. In this case, these two types of sig- 
nals, says von Holst, are of the same sign and 
can cancel each other. This means that this 
particular reafferent input will not signify to
the animal changes taking place in the en-
vironment itself and it will not lead to action
by the animal such as tracking (Experiment
2). If the same afferent input is not accompa-
nied by an efferent copy—as when the en-
vironment moves and the fly is stationary—
the meaning of the afferent input is changed
and in this case led to a tracking behavior to
the right to follow the motion of the environ-
ment (Experiment 1). If there is an efferent
copy of left motion which is now accompanied
by reafference of a different sign—as when
the head was rotated 180° and the visual field
moved not from left to right but from right
to left across the retina—then efference and
reafference do not cancel; the animal inter-
prets the reafferent input presumably as mo-
tion from right to left in the environment and
continues to track it until exhausted (Experi-
ment 3).
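
The comparison that von Holst postulates can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not von Holst's own formalism: the numeric coding (left = -1, none = 0, right = +1), the function name `perceived_environment_motion`, and the choice to code each afferent input by the self-movement that normally produces it are hypothetical conveniences. Under that coding, matching efferent and afferent signals have the same sign and cancel on comparison, as in the account above.

```python
# Hypothetical coding of direction; not taken from von Holst.
LEFT, NONE, RIGHT = -1, 0, 1


def perceived_environment_motion(efference_copy: int, afference: int) -> int:
    """Compare the efference copy with the afferent input.

    Both signals are coded by self-movement direction: `afference` is the
    self-movement that would normally produce the observed retinal flow.
    A zero residual means the input is fully accounted for by the animal's
    own movement. A nonzero residual is attributed to the environment.
    """
    residual = afference - efference_copy
    if residual == 0:
        return NONE  # signals cancel: no environmental motion is signified
    # Flow that looks like an unexplained self-movement to the left is what
    # a rightward-moving environment produces, and vice versa.
    return RIGHT if residual < 0 else LEFT


# (efference copy, afferent input) for the three experiments described above.
experiments = {
    "Experiment 1 (cylinder moves right, fly stationary)": (NONE, LEFT),
    "Experiment 2 (fly moves left, normal head)": (LEFT, LEFT),
    "Experiment 3 (fly moves left, head rotated 180 degrees)": (LEFT, RIGHT),
}

for name, (efference_copy, afference) in experiments.items():
    motion = perceived_environment_motion(efference_copy, afference)
    label = {LEFT: "track left", NONE: "no tracking", RIGHT: "track right"}[motion]
    print(f"{name}: {label}")
```

Under these assumptions the sketch reproduces the qualitative pattern of the three experiments: tracking to the right, no tracking, and tracking to the left. In Experiment 3 the leftward tracking itself generates more of the same uncancelled input, which is one way to read the fly's continued movement until exhaustion.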
What von Holst is saying is that the or- 
ganism records relations between itself (i.e., 
the state of its own motoric) and the environ- 
ment. Thus, according to this model, the state 
of the organism’s motoric is definitely impli-
cated in its decisions to accept or not accept a
given input as a stimulus for further action
and kind of action. Thus, to the fly, a moving
input into the eye can “mean” either that the
environment or parts of it are in motion or
that he is moving. Which of these interpreta-
tions is made depends on the state of the
motoric.
A notion is raised by von Holst’s and Mit-
telstädt’s data which adds to the theoretical
considerations which have been broached thus
far. The relation between the motor process
and perception has been discussed in regard to
its effect on attention or selection in percep-
tion (Hebb, 1949). What is pointed out by
the present data, however, is that motor proc-
esses may affect not just the environmental
input that will be attended to by the organ-
ism or the specific transformations of the
environmental input the motoric will make at
a given moment, but motor processes may also
enter into the meaning which these inputs will
have to the organism. That is, to von Holst
and Mittelstädt, perception is the end product
of a comparison or summation between effer-
ent and afferent signals and by no means a
processing of afferent signals alone. At least
this is their interpretation of the limited sets
of perceptions which they have studied in-
volving primarily the animal’s orientation
in regard to the environment. However, in-
asmuch as the active-perceiver framework
suggests that much of perception involves ex-
tensive participation by the organism’s mo-
toric, the model of perception implicit in von
Holst’s and Mittelstädt’s data may be far
more general than the case of spatial orienta- 
tion to which they have applied it. Their 
model, or some equivalent model, should hold 
true for many percepts of space. Percepts of 
space often involve encounters with an en- 
vironment in which the distinction between 
organism-produced inputs (reafference) and 
environment-produced inputs (exafference) 
is continuously pertinent, at least to a living
organism not restricted to an artificial, labora- 
tory environment. 

A particularly suggestive set of evidence in 
regard to sensorimotor interaction in percep- 
tion has been supplied by the studies on the 
relation between perception and voluntary 
motion contributed by Held (1964), Held 
and Bossom (1961), Held and Hein (1963), 
and Hein and Held (1962). Held (1964) 
argues strongly against the assumption that 


the neurological processing of sensory input is inde- 
pendent of the organization of motor action, that is, 
against the assumption that the analysis of sensory 
input preparatory to motor acts occurs solely in the 
course of a chain of neural events traversing the 
sensory projections but completed prior to impinge- 
ment upon motor centers in the brain. 


In studies of perceptual displacement, Held 
and Bossom (1961) have shown that volun- 
tary motion (as opposed to passively being 
moved about but ostensibly receiving the 
same visual inputs) with its concurrent train 
of sensory and motor feedback provides the 
essential order required for compensation in 
such an environment. That is to say, spatial 
orientation in such an environment was only 
made if voluntary motion was allowed. Fur- 
ther evidence quoted from Held (1964) states: 


Hein and Held have reared kittens with one eye 
open during locomotion in an illuminated surround- 
ing; the other eye was open only during passive 
transport over an equivalent path. After several 
months of such exposure, stimulation of the eye 
that had been open during active movement pro- 
duced normal visually-guided behavior but the other 
eye was functionally blind. These experiments clearly 
implicate the motor system in processes traditionally 
regarded as sensory [pp. 308-309]. 


Held and Hein (1963) also found that self-
produced movement is necessary for the de-
velopment of visually-guided behavior such as
on the “visual cliff” (Walk & Gibson, 1961).
Only those cats which had been allowed vol- 
untary motion while given controlled inputs 
of patterned vision during training showed 
any behavioral evidence of depth discrimina- 
tion when put on the “visual cliff,” or evidence 
of visually-guided paw placement. In the test 
involving visually-guided paw placement, the 
subject’s body was held in the experimenter’s 
hands so that its head and forelegs were free. 
It was then slowly carried forward and down- 
ward toward the edge of a table. A normally 
reared animal shows visually mediated antici- 
pation of contact by extending its paws as it 
approaches the edge. Held and Hein state
that peripheral atrophy resulting from lack of 
use of various organs is contraindicated by 
the presence of pupillary and pursuit reflexes
and the rapid recovery of function of the 
passive subjects once given their freedom. 
Debility specific to the motor system can be 
ruled out, according to the authors, because 
the passive subjects showed the same tactual 
placing responses and other motor activities as 
the normals. 

Held’s data suggest that at least certain 
perceptual invariants are connected with the 
internal “motor language” of the organism. 
What is invariant in the environment to the 
organism in these experiments is some aspect 
of that slice of exteroceptive events which 
occur during some set of specific motor acts, 
that is to say, those inputs into the retina are 
coded which remain invariant over given mo- 
tor movements of the organism. In the ab- 
sence of motor movements to which they can 
become related, certain environmental inputs 
in these experiments tend to be mere noise. 

This set of findings also suggests extensive
feedback between sensory and motor signals
in the brain. As such, these data therefore
would not be inconsistent with the possibility 
suggested by von Holst (1954) that much of 
perception may have to be accounted for in 
terms of a comparison and summation between 
efferent and afferent signals. In fact, Held
(1961) has proposed an expansion of von 
Holst’s (1954) theory which suggests that all 
past combinations of concurrent efferent and 
reafferent signals are stored in a correlation 
storage or memory. During any current ef-
ferent signal, an appropriately-stored reaffer-
ent signal is “sent” from the memory to a
comparator to check on the actually-produced 
reafferent signal, decide on its “meaning,” and 
determine a further response. 
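
Held's proposed expansion can likewise be sketched as a small data structure. The following is a minimal sketch assuming a simple lookup table; the class name `CorrelationStorage`, the function `comparator`, the string labels for signals, and the three outcome labels are all hypothetical and are not drawn from Held's paper. It shows only the flow described above: concurrent efferent and reafferent signals are stored, the stored reafferent signal is recalled during a current efferent signal, and a comparator checks it against the reafferent signal actually produced.

```python
from collections import Counter, defaultdict


class CorrelationStorage:
    """Hypothetical memory of past efferent/reafferent combinations."""

    def __init__(self):
        # For each efferent signal, count the reafferent signals that
        # have accompanied it in the past.
        self._pairs = defaultdict(Counter)

    def store(self, efferent, reafferent):
        """Record one concurrent efferent/reafferent combination."""
        self._pairs[efferent][reafferent] += 1

    def recall(self, efferent):
        """Return the reafferent signal most often paired with `efferent`,
        or None if this efferent signal has no stored history."""
        seen = self._pairs.get(efferent)
        if not seen:
            return None
        return seen.most_common(1)[0][0]


def comparator(storage, efferent, actual_reafferent):
    """Check the actually produced reafferent signal against the stored one."""
    expected = storage.recall(efferent)
    if expected is None:
        return "uninterpretable"  # no efferent-afferent history to draw on
    if expected == actual_reafferent:
        return "self-produced"    # input matches what the movement predicts
    return "mismatch"             # input calls for a further response


# An actively reared animal has stored pairings; a passively reared one has not.
active = CorrelationStorage()
active.store("move_left", "field_drifts_right")
passive = CorrelationStorage()

print(comparator(active, "move_left", "field_drifts_right"))
print(comparator(active, "move_left", "field_drifts_left"))
print(comparator(passive, "move_left", "field_drifts_right"))
```

The empty storage of the "passive" case is meant only to illustrate why, on this reading, inputs without a history of related motor movements would remain uninterpreted. How mismatches are resolved into a "meaning" and a response is left open here, as it is in the summary above.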

The above model seems to fit the data. It
should be noted, moreover, that the model 
seems applicable to depth perception as well 
as to self-orientation in space, the latter of 
which was studied by von Holst and Mittel- 
städt (1950), 

As will be seen, Held’s interpretations re-
main somewhat ambiguous in that he has not
also studied perception in an experimental 
setting in which the animal, rather than mak- 
ing an instrumental response, makes an auto- 
nomic response instead. In so doing, percep- 
tion under mostly exafferent inputs may be 
studied as well. 

Do these studies imply that a passive or- 
ganism, if one could be created, does not or 
cannot perceive? To answer this question, the 
above series of studies must be viewed against 
an experiment which tried to explore precisely 
this question. The study in question is by 
Meyers (1964), and must be compared with 
a study by Riesen and Aarons (1959). Riesen 
and Aarons showed that motion-deprived 
kittens find it impossible to discriminate a
rotating environmental object. For the first
3 months, the kittens in question were given
an hour of patterned vision a day, and during
this period they were deprived of head and
body motion. They were then given the task
to discriminate a stationary target from the
same target revolving, discrimination being
measured by an instrumental response in-
volving locomotion toward the correct target.
The discrimination task and rearing were the
same in Meyers’ (1964) experiment. How-
ever, in Meyers’ study, discrimination was
ascertained not by an instrumental response
but via an autonomic response (i.e., leg
flexion was elicited by shock and then condi-
tioned to the presentation of a given visual
pattern). The cats in the Riesen and Aarons
study were unsuccessful in discriminating the
moving target. They were, however, able to
discriminate light intensity and performed
quite adequately the required instrumental
response of locomoting toward the target with
greater light intensity. The cats in Meyers’
study, on the other hand, were able to dis-
criminate the moving target. Clearly, a dif-
ference between the situation presented by 
Riesen and Aarons versus Meyers, which was 
purposely introduced by Meyers, is that in 
the former case inputs into the eye are pro- 
duced by an organism in motion and that the 
effective solution of the task depends on the 
kitten’s ability to discriminate inputs pro- 
duced by its own locomotor motion from 
inputs due to motions of the environment 
itself. Such a discrimination might (e.g., as 
suggested by Held, 1964; von Holst, 1954) 
have to depend on a history of efferent- 
afferent comparisons of which these cats had 
precisely been deprived. Thus Meyers 
(1964) was successful in training for a dis- 
crimination because he was able to make all 
voluntary movements by the organism, except 
eye movements perhaps, which were not con- 
trolled, irrelevant to the situation.* 
It would seem that as soon as voluntary 
motion by the organism contributes to the 
transformations which the visual inputs 
undergo over time, the data provided by that 
contribution, that is, efference and reafference, 
become necessary variables for the formation 
of perceptual categories. When voluntary mo- 
tion can be excluded during rearing and test- 
ing by special arrangements in a laboratory, 
certain perceptions can occur on the basis of 
visual experience without the concurrence of 
somesthetic and motor experience. It should
be remembered, however, that visual experi-
ence in the Meyers (1964) experiment did
not eliminate motor movements of the eye.
Thus the findings suggest, at least for the cat,
that the visual system is at least largely
independent of the motor system for the
coding of visual inputs and that it has its own
built-in organization for analyzing some in-
puts. However, it is also clear that the per-
ceptual processes of the living, nonlaboratory
cat cannot be described purely in terms of the
principles of, say, the visual cortex. Rather,
problems of efferent-afferent interchange or

* Another difference between Meyers (1964) and
Riesen and Aarons (1959) should be noted. The
speed of target rotation employed by Riesen and
Aarons was three times slower than that employed
by Meyers. The possibility must be considered that
this difference might also have influenced ease of dis-
crimination.

comparison, problems of the contributions 
made by the motoric in terms of the per- 
ceptual elements which might characterize 
perception, and problems of selection and 
attention will all become relevant. Finally, the 
number of different discriminations that can 
be attained for the “passive” laboratory ani- 
mals will undoubtedly be limited and will 
probably not include certain forms of, say, 
depth discrimination and certainly no dis- 
criminations in which the distinction between 
the organism’s movements and movement in 
the environment is expressly called for. 

Finally, there is a set of studies by Hubel 
and Wiesel (1963) which should be reviewed. 
These authors showed that young kittens 
reared in the dark and deprived of patterned 
inputs from birth up to 8, 16, 19, and 20 days, 
respectively, showed the same operations and 
capacities on the part of the visual cortex— 
such as the detection of the orientation of 
narrow slits of light projected on the retina, 
and the movement of such slits across the 
visual field—which Hubel had earlier (1963) 
found for mature cats. There is a slight in- 
crease in the fineness of the cortical reactions 
as a function of age, however. This, then, 
would suggest a large degree of built-in per- 
ceptual readiness for such things as the detec- 
tion of the shapes of contours, the movement 
of contours across the visual field, etc., inde- 
pendent of any actual sensory or motor ex- 
perience on the part of the animal. The fact 
that young kittens need several days after 
opening their eyes before they stop bumping 
into objects or show signs of avoidance or 
pursuit activity is interpreted by Hubel and 
Wiesel as a lack of visuomotor rather than 
purely visual ability. The problem for the 
organism, as has already been seen, is the 
development of associations between afference 
and efference. 

In the case of the fly (von Holst & Mittel- 
städt, 1950) which has already been reported, 
afferent and efferent signals are appropriately 
paired from the beginning and the problem of 
oculomotor coordination is absent. That the 
problem of oculomotor coordination is con- 
siderable for some animals and requires ex- 
perience in order to come about is shown in
a study by Gunter (1951). He reported that 
monocular size constancy can be acquired by 
1-year-old cats which, due to an accident,
had lost the use of one eye eight months 
before the experiment, but not by binocular 
cats which were made monocular for the pur- 
order categories in the visual
be decided from the data. It would seem 
worthwhile to hypothesize from a phylo- 
genetic point of view that as the motoric gains 
in flexibility, so does the visual cortex. The 
possible implications might then be that the 
motoric in certain animals can contribute to 
the formation of functionally interrelated 
clusters of cortical cells, or phase sequences 
in Hebb’s (1949) sense, which form higher- 
order units for perceptual functioning over 
and above the units, say, which are described 
by Hubel (1963) and Hubel and Wiesel 
(1963) for the cat. Finally, in man, it might 
be interesting to speculate that little if any 
prefixed cortical organization in Hubel and 
Wiesel’s sense is present, and that the 
“addressing” in the visual cortex, to mention 
a term used by Platt (1962) which will be 
defined in more detail later, occurs largely 
as a function of efferent-afferent invariances 
which are built up through experience. In any 
case, it is evident that the problem of percep- 
tion in the living organism is not resolved 
by assuming the presence of detectors, like 
straight line detectors, motion detectors, and 
the like, in the visual cortex. As Kohler 
(1964), and Riesen and Aarons (1959) show, 
the assumption of such detectors is not suf- 
ficient to explain even the perception of 
straight lines or motion, to say nothing of 
more complex perceptions. In the living or- 
ganism, the problem of afferent-efferent feed- 
back enters into the process with a whole set 
of additional problems which, however, in- 
crease the richness of the process and add 
to perception the phenomena of attention, 
selection, and spatial orientation, to mention 
only a few. In addition, any discrimination 
between motion-produced versus environment- 
produced inputs may also contribute to the 
perception that environmental inputs possess 
a measure of “objectivity.” 

To conclude this section on psychological 
theory and data on perception, it appears, at 
least for some animals, that perceptions of 
certain kinds may occur without accompany- 
ing voluntary motions, except, perhaps, eye 
motions. It also appears that, for some organ- 
isms at least, there are elaborate built-in 
structures for some kinds of elementary form 
perceptions, but that these are insufficient in 
themselves to guarantee functional perceptual 
capabilities in the living animal which moves 
about and has to deal with motion-produced 
visual inputs. It is further suggested that the 
same kind of physical stimuli which are dis- 
criminated when movement is artificially ex- 
cluded and made irrelevant are no longer 
discriminated when the animal looks at them 
while moving about if the animal has pre- 
viously been deprived of patterned vision 
accompanied by voluntary motion. 

What can be concluded about the differ- 
ences between active and passive perception? 
First, there may be differences in the dimen- 
sions of environmental input which are ab- 
stracted by the system as invariant. A whole 
host of new invariants are by-products of a 
system which moves about in an “optic 
array” (Gibson, 1963). These are elementary 
and higher-order invariants. Examples of 
such invariants are as follows: Under 
transformations in visual input induced by 
the system, certain properties of the input 
remain invariant. Sequences of such per- 
ceived invariants may obtain in any given 
i activity. The recurrence of spe- 
in a sequence itself then 
may become an abstract property of percep- 
tion in an active system. Likewise, sequence 
may become an abstract property. In fact, 
properties such as these may well be involved 
in the identification of inputs as “objects.” 
Likewise, difference or change can become 
an abstract property, and this allows for the 
n of visual input in relational as 
opposed to absolute terms, a tendency on 
the part of the living organism postulated 
n (1961). The above, then, are some 
examples of invariants which are contained in 
and available to an active perceptual process. 
Second, since the invariants in visual input 
of active perception are those which hold 
t specific motor movements, efferent- 
afferent interaction becomes possible. A given 
percept may then depend on efferent as 
well as afferent processes. Thus, the state of 
the motoric of the organism at any given time 
has a great deal to do with what flux 
of sensory events comes into the retina next, 
what features of the input will be attended to, 
and how such events are interpreted. 

What are the implications of such specula- 
tions for the computer simulation of per- 
ceptual processes? (a) It is suggested that 
a computer program which is based on a 
theory of perception as a passive process 
will code and organize its “percepts” in 
units which are often quite irrelevant 
to the codes and organization employed by a 
living organism. If so, the program may have 
limited usefulness as a tool in psychological 
research. These limitations might become par- 
ticularly severe in relation to percepts which 
are thought to involve much afferent-efferent 
interaction, such as perceptions of space. 
(b) A passive-perceiver program, by not inte- 
grating motor behavior with perception, can- 
not deal parsimoniously with exploration, at- 
tention, and selection as part of perception. 
(c) Finally, any computer program based 
on a passive-perceiver theory cannot be ex- 
tended to other significant areas of research 
which involve the relation between perception 
and elementary forms of cognition. An ex- 
ample of such an area is what Piaget (1954) 
has called sensorimotor intelligence. Problems 
in this area, such as those of expectation, 
purposeful activity, the concept of the object, 
and the discrimination between self and ob- 
ject could perhaps be explored by extensions 
of an active-perceiver program. 

In the following, in an attempt to give an 
“active” rather than a “passive” flavor to this 
critique, the beginning of a computer pro- 
gram which is based on a sensorimotor model 
of perception is sketched in broad outline. 
Preliminary problems of perception which 
might be studied with the help of such a pro- 
gram are indicated. Some of the first phases 
of the work proposed have been completed. 


COMPUTER SIMULATION OF ELEMENTARY 
SENSORIMOTOR PROCESSES OF A 
VISUAL SYSTEM 


Generation of Perceptual Units like Straight 
Line, Angle, Figure, Pattern, etc. 


In developing a simulation model, a certain 
set of assumptions have to be made which, 
while they are supported by some theories of 
perception, are somewhat at variance with 
others. This should not be disturbing, how- 
ever, inasmuch as the computer model is used 
to test the theory upon which it is based. 
A first attempt is the simulation of an active 
visual system with motion-produced inputs 
and concomitant sensorimotor interaction. 
The visual system at first explores contours 
and boundaries within its visual field. Under 
conditions to be specified, the system might 
then generate systematic scanning behaviors 
and units of perception. 

There is evidence (Kohler, 1964; Marshall 
& Talbot, 1942; Zusne & Michels, 1964) that 
the organism tends to fixate successively on 
various parts of the contours of an object. 
Among modelers of perception both Hebb 
(1949) and Platt (1962) have theories which 
are explicitly based on this tendency for the 
eye to scan boundaries of objects. In Hebb’s 
system, a reasonably reliable tendency to scan 
along a straight line seems to be presumed. 
The model by Platt (1962) suggests a mecha- 
nism whereby certain curious facts about the 
random nature of the retina and the various 
motions of the eye are combined to explain 
how these motions help to make possible a 
system of self-addressing or functional ad- 
dressing among neurons. Platt, as well as 
Hebb, has contended that under the rotations 
of which the eye is capable certain neural sig- 
nals will be generated in the retina and higher 
centers of the visual cortex which are charac- 
teristic of straight lines at various orienta- 
tions, curves, equidistance, parallelism, and 
several other basic units. The idea or the evi- 
dence for basic units underlying perception 
can be found in the work of many researchers 
(Cherry, 1961; Hebb, 1949; Hubel & Wiesel, 
1963; Lettvin et al., 1959). Platt makes use 
of relatively well-established phenomena of 
eye movement (Mackworth, 1964; Marshall 
& Talbot, 1942; Riggs, Armington, & Ratliff, 
1954). He is concerned, among other things, 
with the slow drifts (of about 2°- 3° of arc). 
Platt contends that over several drifts—if 
such slow drifts happen to be parallel to, 
say, a straight line—invariants in input to 
the retina develop over time (i.e., the same 
retinal cells or samples of retinal cells fire). 
These invariants could be picked up by 
higher-order detectors in the visual system. 
Simultaneously, relationships between the in- 
variant events in the visual cortex and associ- 
ated proprioceptive oculomotor signals may 
develop. Such a possibility also has been 
discussed by Gaarder (1963) and Hebb 
(1949). This kind of development might then 
also account for the differences reported by 
Bruner (1964) between children and adults 
in the recognition 
of blurred pictures. The former fail to inte- 
grate or to take advantage of redundancy, 
and their search is much more random. 

What Platt is trying to demonstrate is that, 
given the properties of the motoric and cer- 
tain properties of an environment (i.e., simi- 
lar to the properties assumed by Gibson, 
1961, for optic arrays such as sharp bounda- 
ries), certain invariants will develop. These 
invariants, as Platt shows, include identities, 
such as the identical inputs into the retina 
which take place when the eye scans parallel 
to a straight line; but higher-order invariants 
are also found in the preservation of sets 
of relations between retinal firing patterns. 
Thus, for example, if a firing pattern of 
retinal cells a, b, and c may be followed after 
a time interval (i.e., a motor displacement) 
by the firing of cells e, f, and g, this 
sequence may be repeated again after another 
time interval (i.e., a set of motor displace- 
ments), and in this case both the entire 
sequence and its repeated components will be 
invariants. In fact, these latter invariants, 
according to Platt (1962), might denote a 
stable pattern with symmetry in the visual 
input. In this manner, Platt claims to have a 
perceptual system which will develop the ca- 
pacity to perceive a set of a priori perceptual 
units or basic perceptual elements (straight 
lines, curves, angles), and eventually will go on 
to higher-order invariants, like parallelism, 
translational comparison (or the capacity to 
discover continuity of experience or sym- 
metry), and several others. All of these phe- 
nomena outlined by Platt require operations 
which are performed on the afferent input. 
Thus, there must be computations at “levels 
higher than the retina” to compute whether 
given sequences of neural signals fired at a 
given time are fired again within some time 
span, that is, after a set of intervening mo- 
tions. Any constancies discovered are corre- 
lated with motor movements. In this regard, 
a set of operations must be described for the 
motoric as well. In principle, sensorimotor 
sequences can be developed in such a system 
if, as Hebb and Platt contend, certain motor 
movements will in fact tend to be correlated 
1:1 with afferent inputs like straight lines, 
curves, angles, and the like. Note that since 
motor movements are presumed for the sys- 
tem, the kind of behavior which intervenes 
in a complex perceptual act and its correla- 
tion with inputs through the visual cortex 
define a given percept. 
Platt’s theory, in its assumption that most 
perceptual elements develop as a matter of 
experience, falls, of course, at one extreme of 
the assumptions that can be made. However, 
others, like Hebb (1937, 1949) and von 
Senden (1960), come close to such a position. 
The first simulation program for perceptual 
development which illustrates sensorimotor 
feedback and coordination is a program which 
generates for itself the perceptual unit of 
straight line. It consists of a retina and a 
motor capacity. The retina comprises the 
equivalent of a fovea and a periphery. The 
motoric feature of the system is capable of 
moving along X and Y axes (omitting in the 
first stage the Z axis which Platt, 1962, also 
discusses). Inputs, initially, are outline pat- 
terns with sharp boundaries; these patterns 
are composed of horizontal, vertical, and 
slanted lines. Some of the input is noise in 
the form of small straight lines which inter- 
sect the main pattern. The program does not 
select out the noise because it does not offer 
an extensive invariance. Noise could also have 
been in the form, say, of curved lines inter- 
secting with straight lines. The “eye” is given 
a general tendency to search out any contours 
in its visual field, at first in a random manner, 
and this eye also has tendencies to bring 
into focus those inputs which are in its 
periphery. Aside from its tendency to seek 
out contours and its possession of grosser 
and finer motions, the program possesses no 
further built-in capacities and acts randomly 
at first. 

Given any visual input, the eye will move 
randomly over such input until either the 
fovea or the periphery “hits” a contour. If 
the fovea hits, but the periphery does not, 
and if the hit is not “substantial,” random 
scanning recommences. If, on the other hand, 
the periphery hits a contour, a motion in the 
general direction of the hit ensues. If, during 
successive moves of such a kind, a large 
number of retinal cells which fired previously 
fire again, the program brings into play finer 
eye movements which make it possible for a 
complete overlap of retinal-cell firings be- 
tween one move and the next to be attained. 
Moves will become more and more similar 
in direction during such fine scans along a 
straight line. If, during a set of identical 
scans, the same retinal cells are fired several 
times, there will be a “facilitation” (to use 
Hebb’s 1949 phrase) between sensory and 
motor activity; and the program will keep 
the line in focus, sweeping back and forth 
several times. If it is unsuccessful 
in attaining complete overlap over successive 
scans, the program keeps adjusting its focus- 
ing until it does record identity 
of firing patterns. It will continue this 
routine unless it loses its focus, in which case 
it will revert back to random moves and 
essentially repeat the process described above. 

A more detailed verbal description of the 
flow chart for a “straight-line perceiver” is 
provided below for the interested reader. 

The input pattern is a 144 × 144 matrix 
of 0s and 1s created by first drawing a 
pattern on graph paper and then marking 
each square which contains part of the pat- 
tern with a 1 and each blank square with a 0. 

The scanning pattern or retina comprises 
an area of 36 × 36 and contains two parts—a 
central part or fovea and an outer part or 
periphery. The fovea is located in an area 
14 × 12 in the center of the retina and con- 
sists of 36 cells, each the size of from 2 to 4 
of the squares discussed for the input pattern. 
Retinal-cell size is thus larger than the acuity 
of perception which can be attained, thanks 
to an irregular arrangement of cells in the 
retina (see also Marshall & Talbot, 1942). 
The periphery is a band 3 squares wide 
around the edge of the retina, and is sepa- 
rated from the fovea by a distance of at 
least 8 squares. The periphery (P) is divided 
into 8 sections, P0 through P7, each of which 
subtends an angle of about 45° from the 
center of the retina. 

A scan⁴ of the input pattern by the retina 
is performed as follows: (a) A word, CEN, is 
initialized to contain 0s in bits 1 through 36. 
(b) The retina is superimposed on a section 
of the input pattern. (c) If any of the squares 
of the input matrix over which cell i of the 
fovea is superimposed contains a 1, then bit i 
of the word CEN is set to a 1. (d) The words 
P0 through P7 are initialized at 0. (e) If any 
of the squares over which section j of the 
periphery is superimposed contains a 1, then 
Pj is set to 1. 

Thus, after a scan, bit i of CEN is 1 if and 
only if cell i of the fovea is hit by the 
input pattern, while Pj is 1 if and only if 
section j of the periphery is hit by the input 
pattern. 
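
The scan just described can be sketched in Python. This is only an illustration under stated assumptions, not the original program: the function name `scan`, the representation of each foveal cell and peripheral section as a list of (row, column) offsets, and the use of bit 0 (rather than bit 1) as the lowest bit of CEN are all choices made here for convenience.

```python
from typing import List, Sequence, Tuple

Square = Tuple[int, int]  # (row, col) offset of one input square inside the retina


def scan(pattern: Sequence[Sequence[int]],
         origin: Square,
         fovea_cells: List[List[Square]],
         periphery_sections: List[List[Square]]) -> Tuple[int, List[int]]:
    """Return (CEN, P) for a retina whose top-left corner lies at `origin`.

    CEN is an integer bit word: bit i is 1 iff foveal cell i covers a 1.
    P is a list of 0/1 flags: P[j] is 1 iff peripheral section j covers a 1.
    """
    top, left = origin
    rows, cols = len(pattern), len(pattern[0])

    def hit(squares: List[Square]) -> bool:
        # A cell or section is "hit" if any square it covers contains a 1.
        for dr, dc in squares:
            r, c = top + dr, left + dc
            if 0 <= r < rows and 0 <= c < cols and pattern[r][c] == 1:
                return True
        return False

    cen = 0  # step (a): all bits start at 0
    for i, cell in enumerate(fovea_cells):  # step (c)
        if hit(cell):
            cen |= 1 << i
    p = [1 if hit(section) else 0 for section in periphery_sections]  # (d), (e)
    return cen, p


# Toy layout (hypothetical, far smaller than the 36 x 36 retina in the text).
toy_pattern = [[0] * 6 for _ in range(6)]
toy_pattern[2] = [1] * 6  # one horizontal line
toy_fovea = [[(2, 2), (2, 3)], [(3, 2), (3, 3)]]
toy_periphery = [[(2, 5)], [(0, 0)]]
cen, p = scan(toy_pattern, (0, 0), toy_fovea, toy_periphery)
```

Squares of the retina that fall outside the input matrix are treated as empty here; the text does not say how the original program handled the edge of the input.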

The method used for recognizing straight 
lines is, as has been pointed out, an applica- 


4 In the following description, the word scan refers 
not to the movement of the eye, which is referred to 
as move, but to the momentary stationary reception 
of inputs by the retina. 
tion of Platt’s (1962) functional geometry to 
straight lines, as well as of Hebb’s (1949) 
model for pattern perception, modified for 
reasons to be described below. Platt’s basic 
idea is that if the eye moves parallel to 
a straight line, then the cells of the fovea 
hit by the line at one point in time will be 
hit again after motion has resumed, since 
their positions relative to the line are un- 
changed. The trouble with Platt’s idea is that 
the chances that the eye will move randomly 
in the right direction are small. Two methods 
are used to correct this fault. First, the 
retina does not move unless the section of 
the periphery in the direction of movement 
was hit during the last scan. For instance, 
if Section 1 of the periphery is hit, then the 
retina is “allowed” to move in any direction 
between 10° and 80°. (Each section overlaps 
25° with the next section. Thus Section 2 
allows movement from 55° to 125°.) This 
peripheral feature cuts down considerably the 
chances of making an entirely “wrong” move. 
The second method of finding the “right” 
move is by using a measure of “straightness” 
(M) to tell how close the last move was, 
adjusting the direction of movement until 
either a perfect measure is obtained or the 
measure falls below a certain minimum. 
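
The peripheral constraint on movement can be illustrated as follows. The two examples in the text (Section 1 allows 10° to 80°, Section 2 allows 55° to 125°) are consistent with each section k being centered at 45k degrees with a half-width of 35°. That rule is an extrapolation made here, and the function names are hypothetical.

```python
from typing import Iterable, Tuple

HALF_WIDTH = 35  # degrees on either side of a section's center (inferred)


def allowed_range(section: int) -> Tuple[int, int]:
    """Lower and upper bound, in degrees, of moves permitted by one section."""
    center = 45 * section
    return (center - HALF_WIDTH) % 360, (center + HALF_WIDTH) % 360


def move_allowed(direction_deg: float, hit_sections: Iterable[int]) -> bool:
    """True if the direction lies within the range of any section that was hit."""
    for section in hit_sections:
        # Signed angular distance from the section center, in [-180, 180).
        offset = (direction_deg - 45 * section + 180) % 360 - 180
        if abs(offset) <= HALF_WIDTH:
            return True
    return False
```

Under this rule adjacent ranges share 25°, as the text states, so a direction of 60° is permitted whether Section 1 or Section 2 was hit.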

The recognition algorithm is now explained 
in detail: 

1. A random scan is made. 

2. CEN is saved in CEN1. 

3. If CEN contains less than 5 1s or no 
section of the periphery was hit on the last 
scan, start over at Step 1. 

4. A random move, (X,Y), in the general 
direction of a peripheral section hit during 
the previous scan, is generated. 

5. The move (X,Y) is made, the pattern 
is rescanned, and CEN is saved as CEN2. 

6. The intersection, C2 = CEN1 ∩ CEN2, is 
taken (bit i of C2 is 1 if and only if bits i of 
both CEN1 and CEN2 are 1s, that is, if and 
only if cell i of the fovea was hit on both 
scans). 

7. If C2 contains less than 4 1s, set M 
equal to 0 and skip to Step 10. 

8. Make the same move (X,Y) again, 
rescan, and record CEN as CEN3. 

9. Take the intersection C3 = C2 ∩ CEN3, 
and then set M equal to the number of 1s in 


GYR, BROWN, WILLEY, AND ZIVIAN 


C3 divided by the number of 1s in C2. (M is 
the measure of straightness mentioned above, 
and attains a maximum of 1 when C3 = C2.) 

10. If M = 1, then a straight line with 
slope Y/X has been found—otherwise con- 
tinue. 

11. If M < .5, set CEN1 equal to CEN3 and 
go back to Step 3. 

12. Set X₁ equal to X, Y₁ equal to Y, and 
M₁ equal to M. 

13. If |X₁| < |Y₁| set ΔX = 1, ΔY = 0; 
otherwise set ΔX = 0 and ΔY = 1. 

14. Set X = X₁ − ΔX, and Y = Y₁ − ΔY. 

15. Repeat Steps 5 through 10. 

16. Set X₂ equal to X, Y₂ equal to Y, and 
M₂ equal to M. 

17. Set X = X₁ + ΔX, and Y = Y₁ + ΔY. 

18. Repeat Steps 5 through 10. 

19. Set X₃ = X, Y₃ = Y, and M₃ = M. 

20. If both M₂ and M₃ are less than .4, 
set CEN1 = CEN3 and go back to Step 3. 

21. Find the largest M, Mᵢ, and next 
largest M, Mⱼ, of the 3 Ms just obtained. 

22. Set M₁ = Mᵢ, X₁ = Xᵢ, Y₁ = Yᵢ, tan₁ 
= Y₁/X₁, and set tan₂ = Yⱼ/Xⱼ. 

23. Set Δ = ½ |tan₁ − tan₂|. 

24. Set tan = tan₁ − Δ. 

25. Find an X and Y such that Y/X ≈ tan 
(X and Y must be integers). 

26. Repeat Steps 5-10 and 16. 

27. Set tan = tan₁ + Δ. 

28. Repeat Steps 25, and 5-10. 

29. Go to Step 19. 
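
The core arithmetic of the procedure can be sketched in Python. The sketch covers only the straightness measure M (Steps 6 through 9), the halving of the tangent difference (Steps 23, 24, and 27), and one possible way of finding integer moves for Step 25. The function names, the representation of CEN words as Python integers, and the bound `max_x` on the size of a move are assumptions introduced here; the text does not say how the original program chose X and Y.

```python
from fractions import Fraction
from typing import Tuple


def popcount(word: int) -> int:
    """Number of 1 bits in a foveal word."""
    return bin(word).count("1")


def straightness(cen1: int, cen2: int, cen3: int, min_overlap: int = 4) -> float:
    """Measure M from three successive scans separated by the same move."""
    c2 = cen1 & cen2                 # Step 6: cells hit on both of the first two scans
    if popcount(c2) < min_overlap:   # Step 7: too little overlap, M = 0
        return 0.0
    c3 = c2 & cen3                   # Step 9: cells hit on all three scans
    return popcount(c3) / popcount(c2)


def refine_tangents(tan1: float, tan2: float) -> Tuple[float, float]:
    """Steps 23, 24, 27: the two candidate tangents on either side of tan1."""
    delta = abs(tan1 - tan2) / 2
    return tan1 - delta, tan1 + delta


def integer_move(tan: float, max_x: int = 12) -> Tuple[int, int]:
    """Step 25: integers (X, Y) with Y/X close to tan and 1 <= X <= max_x."""
    frac = Fraction(tan).limit_denominator(max_x)
    return frac.denominator, frac.numerator
```

For example, candidate tangents of 1.0 and 0.5 give a step of 0.25 and new trial tangents of 0.75 and 1.25, which correspond to the moves (4, 3) and (4, 5). Vertical moves (X = 0) are not representable as a finite tangent and would need separate handling.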

It may be noted that the number of re- 
sponses is much smaller than the number of 
all possible foveal patterns. Consequently, 
when the above procedure is carried out, what 
in essence is done is to partition the foveal 
input patterns into equivalence classes where 
two patterns belong to the same class if and 
only if they both have the same associated 
response. 
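
The partition into equivalence classes can be illustrated with a short Python sketch. The data layout is an assumption made here: each association is a pair of a foveal bit word and the move (X, Y) that yielded a perfect measure for it.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Move = Tuple[int, int]  # the associated response, an (X, Y) move


def partition_by_response(
        associations: Iterable[Tuple[int, Move]]) -> Dict[Move, List[int]]:
    """Group foveal patterns so that two share a class iff their responses match."""
    classes: Dict[Move, List[int]] = defaultdict(list)
    for pattern, response in associations:
        classes[response].append(pattern)
    return dict(classes)
```

Two different foveal words that were both followed successfully with the move (1, 0) thus fall into one class, which plays the role of "a line of that orientation" for the system.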

What the above program has, in essence, 
learned is a functional classification system 
based on its own motor system (in this case, 
the scanning movements of an eye). It has 
learned to group certain cells together in its 
retina and “interpret” these groupings rela- 
tive to its own motor system. In this sense, 
the organism has not just learned the token 
we call a straight line, but has learned the 
meaning of a straight line to within the limits 


COMPUTER SIMULATION 


set by its own motor sophistication. It is be- 
cause of this linkage between patterns experi- 
enced on the retina and patterns generated 
on its behalf that such an organism may be 
constructing the first phases of an internal 
semantic for perception. 

What has been described thus far is a 
sensorimotor system which has already been 
shown, for the simple case of straight lines, 
to be an effective scanner and focuser. When 
the organism is “newborn,” it has no idea 
of what cells lie next to each other or, for 
example, what cells lie in a linear array, etc. 
By utilizing the information gained in the 
aforementioned stage, it can begin to build 
up an internal code specifying those cells 
that in fact do lie next to each other in a 
linear array on its retina. (The retina which 
has been described consists of random cells 
and consequently contains no preaddressed 
structure.) Memory for certain retinal in- 
variants and for sensorimotor connections is 
now being developed through experience. 
Later on, higher-order invariants in the 
“visual cortex” will also be developed. Evolu- 
tion of structures isomorphic to the simple- 
type cortical cells that Hubel (1963) has 
found, that is, cortical cells that respond to 
a straight line of such and such an orienta- 
tion, will thus be accomplished. After several 
such developments, it is hoped that the sys- 
tem will become quite adept at searching out 
forms and objects in a noisy environment. 

The building up of internal classification 
systems is being accomplished by exposing 
the sensorimotor system to the kinds of visual 
inputs which allow for the abstraction of in- 
variants like straightness, angularity, parallel- 
ism, symmetry, and the like (see Platt, 1962). 

A final example of extensions in this gen- 
eral direction is a program now being con- 
structed which will recognize contours con- 
taining several lines of different orientation 
and angles, including triangles. Given the 
presentation of an input containing, among 
other things, two intersecting straight lines, the 
artificial organism will at times follow one 
straight line, a, and then go around a corner 
and follow another straight line, b, with which 
the first line intersects. During the time 
interval which covers such an activity, the 
organism records an invariant series of sen- 
sorimotor events concerned with the following 
of Line a, then a different set of events 
concerned with the following of Line b, and 
finally the organism may also be able 
to record that one invariant sequence goes 
over into another invariant sequence, that is, 
that there is a change in activity and in 
sensory input. Thus some form of what K. U. 
Smith and W. M. Smith (1962) have called 
difference detection in the brain is presumed. 
In the light of this treatment, an angle is a 
primitive or a priori element of the system, 
as is a straight line. 

Note that now a triangle becomes a special 
case of a polygon in which there is a specific 
repetition or recurrence of certain primitive 
elements if the eye scans around a triangle 
in a systematic manner. This kind of invari- 
ance can be detected under what Platt (1962) 
calls translational comparison. Likewise, 
closed figures can be discriminated from open 
figures in a similar manner. For reasons of 
space, a systematic treatment of problems 
of higher-order invariance detection cannot be 
given here, and the reader is referred to Platt 
(1962). 
EXPERIMENTS WITH THE COMPUTER 
PROGRAM 


As has been pointed out, the theory gen- 
erally assumes that certain perceptual ele- 
ments or abstract properties of input remain 
under motion and that more complex per- 
ceptual organization results from some com- 
bination of the more primitive elements. A 
first research effort in connection with the 
program should be to check some of the 
assumptions made by the computer model, 
for example, assumptions with regard to the 
presence and the nature of the so-called 
“perceptual elements,” against known proc- 
esses in human perception. The writings and 
data of Hebb (1949), Lashley (1938), Platt 
(1962), Riesen (1947), and von Senden 
(1960), to mention a few, are pertinent. 

What kinds of predictions or consequences 
might be generated from the simulation model 
of perceptual processes which has been pro- 
posed, which then might be compared with 
data obtained from human beings or other 


living organisms? 
1. The theory ought to have something to 
say about development, particularly about 
the order in which percepts of various kinds 
are attained. Thus, if research with the pro- 
gram were to show that the capacity to per- 
ceive A involves a set of distinct perceptual 
elements X and the capacity to perceive B 
involves a set of distinct perceptual elements 
Y, where Y is contained in X, then it should 
be true that percept B precedes A in onto- 
genetic development. 
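
The form of this prediction can be stated as a small Python sketch. The function name and the string labels for perceptual elements are hypothetical, and strict containment is assumed here to be what "contained in" requires for an ordering to be predicted.

```python
from typing import Optional, Set


def earlier_percept(elements_a: Set[str], elements_b: Set[str]) -> Optional[str]:
    """Which percept the containment rule predicts to develop first.

    Returns "B" if B's elements are strictly contained in A's, "A" in the
    reverse case, and None when neither set contains the other.
    """
    if elements_b < elements_a:
        return "B"
    if elements_a < elements_b:
        return "A"
    return None


# Illustrative element sets only; the labels are not taken from the program.
trapezoid_vs_triangle = {"angle", "line orientation", "ordered sequence"}
polygon_vs_smooth = {"angle"}
first = earlier_percept(trapezoid_vs_triangle, polygon_vs_smooth)
```

When neither set of elements contains the other, the rule as stated makes no prediction about order.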

For example, a computer-based proposition 
might be that the discrimination of the class 
of polygons from the class of continuous non- 
angular contours like single straight lines or 
parallel lines, precedes the discrimination of, 
say, the class of trapezoids from the class 
of triangles. The former discrimination might 
require only that angles be discriminated, 
whereas the latter might require the discrimi- 
nation of straight lines of various orientation 
and angles, as well as, possibly, the discrimi- 
nation of different ordered sequences of such 
events. Fantz (1961) has produced some evi- 
dence which suggests at least differential pref- 
erences for certain types of figures among 
very young infants as well as a change in 
such preferences over time. These preferences 
might be based on the perceptual capabilities 
generated by infants over time. 

2. The theory has something to say about 
perceptual organization. It specifies distinct 
perceptual elements, and it can, in principle, 
determine the way in which these elements 
are combined through a given experience to 
form new higher-order units. Such higher- 
order units might be, for example, sequences 
of lower-order events, relations between lower- 
order events, and numbers of lower-order 
events. A definite organization of perception 
quite clearly follows from the model which 
can be tested against experimental results. 

For example, problems of what constitutes 
an object or a figure, that is, some of the 
Gestalt laws of perceptual organization, might 
be subject to experimentation via these simu- 
lation techniques. One of the Gestalt laws of 
perceptual organization, good continuation, 
perhaps finds expression in the program’s 
tendency to follow through on certain dis- 
covered perceptual units, like a straight line, 
triangle, etc. Stable pattern, connectedness, 
and symmetry are likewise perceptual organi- 
zations which are definable in terms of higher- 
order invariants of the theory (see Platt, 
1962). Thus, at least some problems of 
perceptual organization might be amena- 
ble to experimentation. Experiments on ease 
or difficulty of recognizing inputs might be 
conducted. 

A related problem is, perhaps, the problem 
of perception-at-a-glance. This problem, which 
in the context of the present theory is likely 
to be explainable in terms of development, 
may well be bound up with a great deal of 
experience of the perceiver with the per- 
ceptual world, as a result of which he may 
have been able to identify redundancies be- 
tween figures and the perceptual elements 
which have maximum information content. 
That such variables are operating in some 
psychological information-processing tasks is 
shown in tachistoscopic studies of language 
recognition (Miller, Bruner, & Postman, 
1954). 

Being aware of what higher-order invari- 
ants are possible for the system, it might be 
possible to predict what the more complex 
memory structures are which may be present 
in man but perhaps also in other organisms 
like the cat or chimpanzee. The theory might, 
therefore, be in a position to suggest further 
avenues of exploration to the neurophysiolo- 
gist interested in the functions of the visual 
cortex (e.g., Hubel & Wiesel, 1963). 

The theory can be used to test the clas- 
sification of shapes, etc., according to their 
similarity. Those shapes should be most simi- 
lar which have the greatest number of per- 
ceptual elements in common. However, the 
theory would also predict that criteria of 
similarity will change in specified ways with 
specific experience. 

The problem of perceptual generalization 
likewise can be explored. Depending on the 
stage of development, those generalizations 
will be easily made for which there exists 
a necessary and sufficient set of perceptual 
elements or higher-order set of invariants. 
For example, the problem may be explored 
whether an inverted triangle or an inverted 
human figure is still classified appropriately. 

3. The theory has something to say about 
attention and selectivity in perceptual de- 
velopment. Thus it makes predictions about 
the kinds of eye movements which will tend 
to accompany a given perceptual act in per- 
ceptual development and what features of 
input will be attended to. Eye-motion studies 
in the tradition of Mackworth (1964) or 
Zusne and Michels (1964), using techniques, 
however, which are appropriate to the desired 
age group, could be developed. 

Note that the computer program is in 
principle a tool (barring only failure to 
successfully construct it) for the testing of a 
rigorously phrased theory for many percep- 
tual phenomena. 

It should also be pointed out that the above 
problems require, first of all, research with 
the computer itself in order to establish, for 
example, what internal organization is re- 
quired for the generation of a percept or 
capacity by the computer. Following this, 
the computer behavior must then be compared 
with the behavior of living organisms. 


PSYCHOLOGICAL NECROLOGY (1953-1964) 


EDITH L. ANNIN AND EDWIN G. BORING 


Harvard University 


Since no one can include his own date of 
death in a biographical sketch published 
during his lifetime, the biographical compendia, 
which usually omit the sketch after a person 
has died, therefore fail to provide the date. 
Who Was Who in America gives this impor- 
tant date, but the scope of that volume is so 
broad that its coverage of psychology is 
greatly restricted. Thus it is difficult to find 
the dates of death of prominent psychologists, 
and for this reason the “senior” author of 
the present note has published necrological 
lists in this journal covering the 25-year 
periods 1903-1927 (Boring, 1928) and 1928- 
1952 (Bennett & Boring, 1954). In view of 
the growth of psychology in recent years, we 
thought it would be appropriate now to com- 
pile and publish a new list for the 12-year 
period 1953-1964, and the increased fre- 
quency of death of important psychologists, 
due to the increase in the population of 
psychologists, has supported our judgment. 
For the list of 1928-1952, the total number 
of names considered for inclusion was 809, 
whereas for the present one, which covers 
only half as long a period, the total was 875! 
Much the same sources furnished us with 
information: Directories of the American 
Psychological Association, Psychological Ab- 
stracts, and the American Journal of Psy- 
chology for the United States, whereas for 
the rest of the world we consulted L’Année 
Psychologique, which under Henri Piéron 
(whose own death in 1964 we include now) 
ean the Archives de Psychologie as an 
especially fruitful source. Moreover, J. C. 
Bian pony Archivist of the British 
tak of ens turned out to be a 
EINE 0, tine i in his own right. From 
other jour e additional names emerged in 
als fe. or were supplied by individu- 
Ow inclu akow kindly undertook to check 
Psychoansion of names for the psychiatric- 
alytic field. 
The Directories of the APA provided 561 
names, which we submitted to a panel of 


eight persons of varied ages and backgrounds 
but with a broad acquaintance among psy- 
chologists—G. W. Allport, K. M. Dallenbach, 
R. J. Herrnstein, J. McV. Hunt, G. A. Miller, 
R. I. Watson, F. H. Sanford, and the senior 
author. We asked each of them to single- 
check every name that he recognized and to 
double-check every name that he felt was 
worthy of inclusion in the new list, We sent 
these persons samples of the previous lists, 
suggesting that they maintain approximately 
the same levels for inclusion. 

Altogether, 139 of the 561 names received 
at least one double check. One rater double- 
checked 107 names, another only 36, whereas 
the other six raters ranged more or less evenly 
between these two. Starting with these 139 
names, we decided to count two single checks 
as a double check and to accept for inclusion 
every name which had received the equivalent 
of three and a half double checks. This pro- 
cedure yielded 80 names, which constitute 
the major portion of the list for the United 
States. 

Next we turned to the rest of the world, 
and principally (on advice from Paul Fraisse) 
to the Piéron lists in L’Année Psychologique. 
After eliminating duplications, we were left 
with 139 additional names to consider for in- 
clusion. Americans were not rare in L’Année, 
but we had already met most of them in the 
APA lists, although we did find and add a few 
peripheral persons like Adelbert Ames, A. C. 
Kinsey, and Clyde Kluckhohn. The fact that 
Piéron gave at least a sentence and at most 
a page to each of the deceased made it rela- 
tively easy to eliminate otologists and edu- 
cators and to select the important psycholo- 
gists. Not wishing to rely solely on our own 
judgment, however, we consulted H. C. J.
Duijker, J. C. Kenna, Paul Fraisse, the
Deutsche Gesellschaft für Psychologie, and
others, giving them the names we had already
found for their respective countries and ask-
ing them to delete or add names and to check
the correctness of dates and professional af-
filiations. To the 80 we thus
added 43 names from L’Année and 20 sug-
gested by our various consultants, bringing
the number up to 143. 

The American Journal of Psychology pro-
vided 13 more, most of them either distin-
guished foreigners, like Agostino Gemelli and
R. A. Fisher, or persons from related fields, 
like W. J. Crozier and G. H. Parker, since 
the necrologies of Americans mostly dupli- 
cated names from the APA lists. The final 
22 names came from a number of sources— 
including three from the preceding list (Ben- 
nett & Boring, 1954) because their deaths 
actually fell in 1953. From the near end of 
the time span come 8 names of persons who 
died early in 1965. 

The completed list thus comprises 178 per- 
sons, for each of whom we give below the 
exact date of birth and the exact date of 
death. (The sole exception is Rabaud, for 
whom no amount of effort here or abroad 
could produce the day in July on which he 
died.) The geographical entry is not the place 
of death, which may be unimportant, but the 
final important professional affiliation or loca- 
tion, and is intended to serve mainly as a 
means of identification. 

Following the current list of 178 names is 
a supplementary list of 47 names. Thirty-one 
of these entries correct or supplement infor- 
mation that appeared in one or another of the 
earlier lists (Bennett & Boring, 1954; Boring, 
1928); the remaining 16, which are marked 
with an asterisk, represent persons who died 
before 1953 but whose names have not 
previously appeared—most notably Edward 
Wheeler Scripture, whose death in 1945 seems 
to have been unknown to psychologists until 
1955, 10 full years later (Boring, 1965). By 
far the greater part of the additional informa- 
tion contained in this supplementary list 
comes from J. C. Kenna, to whom we are 
heavily indebted for his indefatigable efforts 
throughout the preparation of this article.

Alexander, Franz, Mt. Sinai Hosp., Los Angeles, Calif., b. 22 Jan. 1891, d. 8 Mar. 1964.
[...], George [...], Butler Hosp., Providence, R. I., b. 2 Feb. 1902, d. 29 [...].
Ames, Adelbert, Jr., Dartmouth Coll., b. 19 Aug. 1880, d. 3 July 1955.
Angyal, Andras, Boston, Mass., b. 21 June 1902, d. 31 Dec. 1960.
Bagby, English, Univer. North Carolina, b. 9 May 1891, d. 14 Jan. 1955.
Banister, Harry, Cambridge Univer., b. 12 Aug. 1882, d. 19 Jan. 1963.
Beebe-Center, John Gilbert, Harvard Univer., b. 19 Mar. 1897, d. 6 Dec. 1958.
Bentley, Madison, Cornell Univer., b. 18 June 1870, d. 29 May 1955.
Bernfeld, Siegfried, San Francisco, Calif., b. 7 May 1892, d. 1 Apr. 1953.
Bibring, Edward, Boston, Mass., b. 20 Apr. 1894, d. 11 Jan. 1959.
Bird, Charles, Univer. Minnesota, b. 23 Mar. 18[...], d. 22 Jan. 1957.
Blachowski, Stefan, Univer. Poznan, b. 19 May 1889, d. 31 Jan. 1962.
Boder, David Pablo, Illinois Inst. Technology, b. 9 Nov. 1886, d. 18 Dec. 1961.
Bonaparte, Marie, Paris, b. 2 July 1882, d. 21 Sept. 1962.


Braunshausen, Nicolas, Univer. Liège, b. 16 Oct. 1874, d. 11 Dec. 1956.
Brown, Warner, Univer. California, b. 9 Feb. 1882, d. 6 Feb. 1956.
Brugmans, Henri Johan Frans Willem, Univer. Groningen, b. 5 June 1884, d. 21 Feb. 1961.
Brunswik, Egon, Univer. California, b. 18 Mar. 1903, d. 7 July 1955.
Bryan, William Lowe, Indiana Univer., b. 11 Nov. 1860, d. 21 Nov. 1955.
Bühler, Karl, Los Angeles, Calif., b. 27 May 1879, d. 24 Oct. 1963.
Bujas, Ramiro, Univer. Zagreb, b. 23 Aug. 1879, d. 3 Oct. 1959.
Burloud, Albert, Univer. Rennes, b. 23 May 1888, d. 10 Mar. 1954.
Bykov, Konstantin Mikhailovich, Acad. Sciences, Moscow, b. 21 Jan. 1886, d. 13 May 1959.
Carlson, Anton Julius, Univer. Chicago, b. 29 Jan. 1875, d. 2 Sept. 1956.
Carr, Harvey A., Univer. Chicago, b. 30 Apr. 1875, d. 27 June 1954.
Craig, Wallace, Harvard Univer., b. 20 July 1876, d. 26 Apr. 1954.
Crichton-Miller, Hugh, Tavistock Clin., London, b. 5 Feb. 1877, d. 1 Jan. 1959.
Crozier, William John, Harvard Univer., b. 24 May 1892, d. 2 Nov. 1955.
Culler, Elmer Augustine Kurtz, Univer. Rochester, b. 11 Oct. 1889, d. 30 June 1961.
Davis, Roland Clark, Indiana Univer., b. 20 Dec. 1902, d. 23 Feb. 1961.
Dearborn, Walter Fenno, Harvard Univer., b. [...] July 1878, d. 21 June 1955.
Deutsch, Albert Leopold, New York Downstate Medical Center, b. 19 May 1907, d. 18 June 1961.
Deutsch, Felix, Boston, Mass., b. 9 Aug. 1884, d. 2 Jan. 1964.
Edridge-Green, Frederick William, London, b. 14 Dec. 1863, d. 17 Apr. 1953.
English, Horace Bidwell, Ohio State Univer., b. 1 Oct. 1892, d. 20 July 1961.


Farnsworth, Dean, Office of Naval Research (London), b. 22 Jan. 1902, d. 27 Dec. 1959.
Fearing, Franklin, Univer. California Los Angeles, b. 24 Nov. 1892, d. 26 Mar. 1962.
Fernberger, Samuel Weiller, Univer. Pennsylvania, b. 4 June 1887, d. 2 May 1956.
Fisher, Sir Ronald Aylmer, Cambridge Univer., b. 17 Feb. 1890, d. 29 July 1962.
Fitts, Paul Morris, Univer. Michigan, b. 5 May 1912, d. 2 May 1965.
Flugel, John Carl, Univer. London, b. 13 June 1884, d. 17 Aug. 1955.
Frenkel-Brunswik, Else, Univer. California, b. 18 Aug. 1908, d. 31 Mar. 1958.
Fromm-Reichmann, Frieda, Chestnut Lodge Sanitarium, Rockville, Md., b. 23 Oct. 1889, d. 28 Apr. 1957.
Fryer, Douglas Henry, Richardson, Bellows, Henry, & Co., b. 7 Nov. 1891, d. 25 Dec. 1960.
Fulton, John Farquhar, Yale Univer., b. 1 Nov. 1899, d. 29 May 1960.
Garnett, [James Clerk] Maxwell, Oxford, Eng., b. 13 Oct. 1880, d. 19 Mar. 1958.
Gemelli, Agostino [Edoardo], Università Cattolica del Sacro Cuore, Milan, b. 18 Jan. 1878, d. 15 July 1959.
Gesell, Arnold [Lucius], Yale Univer., b. 21 June 1880, d. 29 May 1961.
Goddard, Henry Herbert, Ohio State Univer., b. 14 Aug. 1866, d. 18 June 1957.
Goldstein, Kurt, New York City, b. 6 Nov. 1878, d. 19 Sept. 1965.
Goodenough, Florence Laura, Univer. Minnesota, b. 6 Aug. 1886, d. 4 Apr. 1959.
Gorphe, François Joseph, Cour d'Appel, Poitiers, b. 14 June 1889, d. 21 Nov. 1959.
Guillaume, Paul, Univer. Paris, b. 26 June 1878, d. 4 Jan. 1962.
Guthrie, Edwin Ray, Univer. Washington, b. 9 Jan. 1886, d. 23 Apr. 1959.
Hartmann, George Wilfried, Teachers Coll., Columbia Univer., b. 29 Mar. 1904, d. 11 June 1955.
Hayes, Samuel Perkins, Perkins Inst. for Blind, Watertown, Mass., b. 17 Dec. 1874, d. 7 May 1958.
Healy, William, Judge Baker Guidance Center, Boston, Mass., b. 20 Jan. 1869, d. 15 Mar. 1963.
Hellpach, Willy, Univer. Heidelberg, b. 26 Feb. 1877, d. 6 July 1955.
Henderson, Sir David Kennedy, Univer. Edinburgh, b. 24 Apr. 1884, d. 20 Apr. 1965.
Herrick, Charles Judson, Univer. Chicago, b. 6 Oct. 1868, d. 29 Jan. 1960.
Hitschmann, Eduard, Cambridge, Mass., b. 28 July 1871, d. 31 July 1957.
Hollingworth, Harry Levi, Columbia Univer., b. 26 May 1880, d. 17 Sept. 1956.
Holsopple, James Quinter, Veterans Admin., Washington, D. C., b. 26 July 1900, d. 10 May [...].
Holzinger, Karl John, Univer. Chicago, b. 9 Aug. 1892, d. 15 Jan. 1954.


Hovland, Carl Iver, Yale Univer., b. 12 June 1912, d. 16 Apr. 1961.
Hulin, Wilbur Schofield, Pacific Univer., b. 2 Oct. 1899, d. 11 Dec. 1959.
Hunter, Walter Samuel, Brown Univer., b. 22 Mar. 1889, d. 3 Aug. 1954.
Israel, Harold Edward, Smith Coll., b. 16 Nov. [...]
[...] July 1882, d. 13 Nov. 1953.
Jenkins, William Leroy, Lehigh Univer., b. [...] Apr. 1898, d. 25 July 1957.
Johnson, Buford Jeannette, Johns Hopkins Univer., b. 23 Aug. 1880, d. 11 July 1954.
Johnson, Harry Miles, Tulane Univer., b. 16 May 1885, d. 16 Aug. 1953.
Jones, [Alfred] Ernest, London Clinic for Psycho-Analysis, b. 1 Jan. 1879, d. 11 Feb. 1958.
Jones, Harold Ellis, Univer. California, b. 3 Dec. 1894, d. 7 June 1960.
Jones, Loyd Ancile, Eastman Kodak Co., b. [...] Apr. 1884, d. 15 May 1954.
Jung, Carl Gustav, Zurich, Switzerland, b. 26 July 1875, d. 6 June 1961.
Kafka, Gustav, Univer. Würzburg, b. 23 July 1883, d. 12 Feb. 1953.
Karwoski, Theodore, Dartmouth Coll., b. 23 Sept. 1896, d. 10 Dec. 1957.
Katz, David, Univer. Stockholm, b. 1 Oct. 1884, d. 2 Feb. 1953.

Kelley, Truman Lee, Harvard Univer., b. 25 May 1884, d. 2 May 1961.
Kinsey, Alfred Charles, Indiana Univer., b. 23 June 1894, d. 25 Aug. 1956.
Kitson, Harry Dexter, Teachers Coll., Columbia Univer., b. 11 Aug. 1886, d. 25 Sept. 1959.
Klages, Ludwig, Zurich, Switzerland, b. 10 Dec. [...]
[...] Psycho-Analysis, London, [...]
Kluckhohn, Clyde, Harvard Univer., [...]
[...]ing Inst., b. 9 Mar. 1879, d. 10 July 1957.
Kretschmer, Ernst, Univer. Tübingen, b. 8 Oct. 1888, d. 8 Feb. 1964.
Kris, Ernst, New York City, b. 26 Apr. 1900, d. 27 Feb. 1957.
Lacey, Oliver, Univer. Alabama, b. 2 July 1916, d. 1 Feb. 1961.
Landis, Carney, Columbia Univer., b. 11 Jan. 1897, d. 5 Mar. 1962.
Langfeld, Herbert Sidney, Princeton Univer., b. 24 July 1879, d. 25 Feb. 1958.
Larguier des Bancels, Jean, Univer. Lausanne, b. 3 Apr. 1876, d. 8 May 1961.

Lashley, Karl Spencer, Harvard Univer.; Yerkes Labs., Orange Park, Fla., b. 7 June 1890, d. 7 Aug. 1958.
Ley, Auguste Charles, Univer. Brussels, b. 16 Apr. 1873, d. 10 Jan. 1956.
Lindner, Robert Mitchell, Baltimore, Md., b. 14 May 1914, d. 27 Feb. 1956.
Linton, Ralph, Yale Univer., b. 27 Feb. 1893, d. 24 Dec. 1953.
Lorge, Irving, Teachers Coll., Columbia Univer., b. 19 Apr. 1905, d. 25 Jan. 1960.
Louttit, Chauncey McKinley, Wayne Univer., b. 9 Oct. 1901, d. 24 May 1956.
Marbe, Karl, Univer. Würzburg, b. 31 Aug. 1869, d. 2 Jan. 1953.
Mateer, Florence, Child Relations Clinic, Pomona, Calif., b. 6 Dec. 1887, d. [...] May 1961.
Mayer-Gross, Willy, Univer. Heidelberg, b. 15 Jan. 1889, d. 14 Feb. 1961.
Michotte [van den Berck], [Baron] Albert Edouard, Univer. Louvain, b. 13 Oct. 1881, d. 2 June 1965.
Moede, Walther, Technische Hochschule, Berlin, b. 3 Sept. 1888, d. 30 May 1958.

Muenzinger, Karl Friedrich, Univer. Colorado, b. 28 Apr. 1885, d. 23 Nov. 1958.
Murchison, Carl, Journal Press, Provincetown, Mass., b. 3 Dec. 1887, d. 20 May 1961.
Nissen, Henry Wieghorst, Yerkes Labs., Orange Park, Fla., b. 5 Feb. 1901, d. 27 Apr. 1958.
Oberndorf, Clarence Paul, Columbia Univer., b. 16 Feb. 1882, d. 30 May 1954.
Ogden, Robert Morris, Cornell Univer., b. 6 July 1877, d. 2 Mar. 1959.
Ombredane, André, Univer. Brussels, b. 19 Nov. 1898, d. 19 Sept. 1958.
Orbeli, Leon Abgarovich, Acad. Sciences, Moscow, b. 7 July 1882, d. 12 Dec. 1958.
Osório de Almeida, Miguel, Inst. Oswaldo Cruz, Rio de Janeiro, Brazil, b. 1 Sept. 1890, d. 1 Dec. 1953.
Parker, George Howard, Harvard Univer., b. 23 Dec. 1864, d. 26 Mar. 1955.
Paterson, Donald Gildersleeve, Univer. Minnesota, b. 18 Jan. 1892, d. 4 Oct. 1961.
Peters, Wilhelm, Univer. Istanbul, b. 11 Nov. 1880, d. 29 Mar. 1963.
Pfister, Oskar, Zurich, Switzerland, b. 23 Feb. 1873, d. 7 Aug. 1956.
Piéron, Henri, Collège de France, b. 18 July 1881, d. 6 Nov. 1964.
Pillsbury, Walter Bowers, Univer. Michigan, b. 21 July 1872, d. 3 June 1960.
Polyak, Stephen, Univer. Chicago, b. 13 Dec. 1889, d. 9 Mar. 1955.
Ponzo, Mario, Univer. Rome, b. 23 June 1882, d. 9 Jan. 1960.
Popov, Nicolas, Centre National de la Recherche Scientifique, Paris, b. 25 Aug. 1888, d. 23 July [...].
Porter, James Pertice, Ohio Univer., b. 23 Sept. 1873, d. 15 Sept. 1956.
Poyer, Georges Paul, Univer. Paris, b. 14 Mar. 1884, d. 15 Sept. 1958.
Pradines, Maurice-François, Univer. Paris, b. 2 Mar. 1884, d. 26 Mar. 1958.
Rabaud, Etienne, Univer. Paris, b. 12 [...], d. ? July 1956.

Radulesco-Motru, Constantin, Inst. Psychol., Bucharest, b. 2 Feb. 1868, d. 7 Mar. 1957.
Ramsdell, Donald Angus, Veterans Admin. Hosp., Boston, Mass., b. 4 June 1904, d. 8 Aug. 196[...].
Rapaport, David, Austen Riggs Found., Stockbridge, Mass., b. 30 Sept. 1911, d. 14 Dec. 1960.
Révész, Géza, Univer. Amsterdam, b. 9 Dec. 1878, d. 19 Aug. 1955.
Reymert, Martin Luther, Mooseheart and Moosehaven Labs., b. 10 Nov. 1883, d. 2 June 1953.
Rich, Gilbert Joseph, Roanoke, Va., Guidance Center, b. 16 Oct. 1893, d. 12 Apr. 1963.
Richardson, Lewis Fry, Paisley Tech. Coll., b. 11 Oct. 1881, d. 30 Sept. 1953.
Riviere, Joan [née Verrall], Inst. of Psycho-Analysis, London, b. 28 June 1883, d. 20 May 1962.
Roback, A[braham] A[aron], Emerson Coll., b. 19 June 1890, d. 5 June 1965.
Róheim, Géza, New York Psychoanalytic Inst., b. 12 Sept. 1891, d. 7 June 1953.
Rubinstein, Sergei L., Univer. Moscow, b. 19 June 1889, d. 11 Jan. 1960.
Ruckmick, Christian Alban, Univer. Miami, b. 4 Sept. 1886, d. 7 Jan. 1961.
Scheerer, Martin, Univer. Kansas, b. 10 June 1900, d. 19 Oct. 1961.
Schlosberg, Harold, Brown Univer., b. 3 Jan. 1904, d. 5 Aug. 1964.
Scott, Walter Dill, Northwestern Univer., b. 1 May 1869, d. 23 Sept. 1955.
Seashore, Harold Gustav, Psychological Corp., b. 4 Aug. 1906, d. 13 June 1965.
Siegel, Sidney, Pennsylvania State Univer., b. 4 Jan. 1916, d. 29 Nov. 1961.
Simon, Théodore, French School of Anthropology, b. 10 July 1873, d. 4 Sept. 1961.

Smith, Norman Kemp, Univer. Edinburgh, b. 5 May 1872, d. 3 Sept. 1958.
Spranger, Eduard, Univer. Tübingen, b. 27 June 1882, d. 17 Sept. 1963.
Stone, Calvin Perry, Stanford Univer., b. 28 Feb. 1892, d. 28 Dec. 1954.
Stouffer, Samuel Andrew, Harvard Univer., b. 6 June 1900, d. 24 Aug. 1960.
Stratton, George Malcolm, Univer. California, b. 26 Sept. 1865, d. 8 Oct. 1957.
Strauss, Eric Benjamin, St. Bartholomew's Hosp. and Univer. London, b. 18 Feb. 1894, d. 11 J[...] 1961.
Struck, Ernst, Karl-Marx Univer., Leipzig, b. [...] July 1890, d. 26 Oct. 1954.
Symonds, Percival Mallon, Teachers Coll., Columbia Univer., b. 18 Apr. 1893, d. 6 Aug. 1960.
Tansley, Arthur George, Oxford Univer., b. 15 Aug. 1871, d. 25 Nov. 1955.
Taylor, Franklin Veazey, Naval Research Lab., Washington, D. C., b. 13 Dec. 1910, d. 12 Ma[...] 1960.
Terman, Lewis Madison, Stanford Univer., b. 15 Jan. 1877, d. 21 Dec. 1956.
Thomson, Sir Godfrey [Hilton], Moray House and Univer. Edinburgh, b. 27 Mar. 1881, d. 9 Feb. 1955.
Thurstone, Louis Leon, Univer. North Carolina, b. 29 May 1887, d. 29 Sept. 1955.


Tolman, Edward Chace, Univer. California, b. 14 Apr. 1886, d. 19 Nov. 1959.
Tolman, Ruth Sherman, Univer. California Los Angeles, b. 10 Oct. 1895, d. 18 Sept. 1957.
Valentine, Charles Wilfrid, Univer. Birmingham, b. 16 Aug. 1879, d. 26 May 1964.
Vaughan, Wayland Farries, Boston Univer., b. 3 [...] 1901, d. 21 Jan. 1961.
Viaud, Jean Gaston, Univer. Strasbourg, b. 3 Nov. [...], d. 30 Dec. 1961.
Volkelt, Hans, Univer. Leipzig, b. 4 June 1886, d. 18 Jan. 1964.
Axszr, Guza, Univer. Cairo, b. 25 Sept. 1891, d. 10 Jan. 1955.
von Holst, Erich Walter, Max-Planck-Inst. für Verhaltensphysiologie, Seewiesen, Austria, b. 28 Nov. [...], d. 25 May 1962.
Wallon, Henri Paul Hyacinthe, Collège de France, b. 16 June 1879, d. 1 Dec. 1962.
Warden, Carl John, Columbia Univer., b. 18 Mar. 1890, d. 28 Feb. 1961.
Watanabe, Tohru, Nihon Univer., Tokyo, b. 7 Sept. 1883, d. 12 Jan. 1957.
Watson, John Broadus, Woodbury, Conn., b. 9 Jan. 1878, d. 25 Sept. 1958.
Wells, Frederic Lyman, Harvard Univer., b. 22 [...] 1884, d. 2 June 1964.
Werner, Heinz, Clark Univer., b. 11 Feb. 1890, d. 14 May 1964.
Wheeler, Raymond Holder, Babson Inst., b. 9 [...] 1892, d. 24 Aug. 1961.
Witmer, Lightner, Univer. Pennsylvania, b. 28 June 1867, d. 19 July 1956.
Wolff, Werner, Bard Coll., b. 3 Feb. 1904, d. 18 May 1957.
Wolters, Albert William Phillips, Univer. Reading, Eng., b. 24 June 1883, d. 7 June 1961.
Woodworth, Robert Sessions, Columbia Univer., b. 17 Oct. 1869, d. 4 July 1962.
Yatabe, Tatsuro, Kyoto Univer., b. 24 Oct. 1893, d. 24 Mar. 1958.
Yerkes, Robert Mearns, Yale Univer., b. 26 May 1876, d. 3 Feb. 1956.
Zilboorg, Gregory, New York Medical Coll., b. 25 Dec. 1890, d. 17 Sept. 1959.


SUPPLEMENTARY LIST 


Asterisks denote additions to earlier lists; unmarked entries
supplement or correct information previously given.


Ach, Narziss Kaspar, Univer. Göttingen, b. 29 Oct. 1871, d. 25 July 1946.
Aschaffenburg, Gustav, Baltimore, Md., b. 23 May 1866, d. 2 Sept. 1944.
[...], Berlin, b. 22 May 1843, d. 15 [...]
Baley, Stefan, Univer. Warsaw, b. 4 Feb. 1885, d. [...] Sept. 1952.
Ballard, Philip Boswood, London, b. 13 Feb. [...], d. 1 Nov. 1950.
Bianchi, Leonardo, Naples, b. 15 Apr. 1848, d. [...] Feb. 1927.
Becomaxs, Kosasıx:as, Leipzig, b. 17 Nov. 185[...], d. 22 Aug. 1914.
Brough, Joseph, University Coll., Aberystwyth, b. 16 Mar. 1852, d. ? Dec. 1925.
*Brown, William, Oxford Univer., b. 5 Dec. 1881, d. 17 May 1952.
Durkheim, Emile, Paris, b. 13 Apr. 1858, d. 15 Nov. 1917.
Erdmann, Benno, Univer. Berlin, b. 30 May 1851, d. 7 Jan. 1921.
Fischer, Kuno Ernst Berthold, Heidelberg, b. 23 July 1824, d. 5 July 1907.
Golgi, Camillo, Univer. Pavia, b. 7 July 1843, d. 21 Jan. 1926.
*Gordon, Ronald Grey, Bath, Eng., b. 25 Mar. [...]
[...] 1845, d. 4 May 1915.
Green, John Alfred, Sheffield, b. 15 Oct. 1867, d. 12 Mar. 1922.
His, Wilhelm, Leipzig, b. 9 July 1831, d. 1 May 1904.


Hitzig, Eduard, Berlin, b. 6 Feb. 1838, d. 20 Aug. 1907.
Hoch, August, New York City, b. 20 Apr. 1868, d. 23 Sept. 1919.
Horsley, Sir Victor Alexander Haden, London, b. 14 Apr. 1857, d. 16 July 1916.
Jackson, John Hughlings, London, b. 4 Apr. 1835, d. 7 Oct. 1911.
Lamprecht, Karl, Leipzig, b. 23 Feb. 1856, d. 11 [...]
Loeb, Jacques, [...], d. 11 Feb. 1924.
Lubbock, John [Baron Avebury], London, b. 30 Apr. 1834, d. 28 May 1913.
Mantegazza, Paolo, Florence, b. 31 Oct. 1831, d. 26 Aug. 1910.
Marshall, Henry Rutgers, New York City, b. 22 July 1852, d. 3 May 1927.
Maudsley, Henry, London, b. 5 Feb. 1834, d. 24 Jan. 1918.
Mitchell, S[ilas] Weir, Philadelphia, b. 15 Feb. 1829, d. 4 Jan. 1914.
Möbius, Paul Julius, Leipzig, b. 24 Jan. 1853, d. 8 Jan. 1907.
*Pauli, Richard Maria, Univer. Munich, b. 12 May 1886, d. 22 Mar. 1951.
Pflüger, Eduard Friedrich Wilhelm, Bonn, b. 7 June 1829, d. 16 Mar. 1910.
*Philpott, Stanley John Ford, Univer. London, b. 11 Aug. 1888, d. 6 Sept. 1952.
*Read, Carveth, Univer. London, b. 16 Mar. 1848, d. 6 Dec. 1931.
*Rickman, John, Inst. of Psycho-Analysis, London, b. 10 Apr. 1891, d. 1 July 1951.
*Scripture, Edward Wheeler, Bristol, Eng., b. 21 May 1864, d. 31 July 1945.
*Stenquist, John L., Baltimore, Md., b. 9 May 1885, d. 8 Nov. 1952.
Strutt, John William [3rd Baron Rayleigh], London, b. 12 Nov. 1842, d. 30 June 1919.
Sully, James, London, b. 5 Mar. 1842, d. 1 Nov. 1923.
Tamburini, Augusto, Rome, b. 18 Aug. 1848, d. 28 July 1919.
Waller, Augustus Désiré, London, b. 12 July 1856, d. 11 Mar. 1922.
Weismann, August Friedrich Leopold, Freiburg, b. 17 Jan. 1834, d. 5 Nov. 1914.
*Wellman, Beth Lucy, Iowa Child Welfare Research Station, b. 10 June 1895, d. 22 Mar. 1952.
*Wiersma, Enno Dirk, Univer. Groningen, b. 29 Nov. 1858, d. 24 Nov. 1940.
*Wittels, Fritz, New York Psychoanalytic Inst., b. 14 Nov. 1880, d. 16 Oct. 1950.

Ziehen, Theodor, Univer. Halle, b. 12 Nov. 1882, d. 29 Dec. 1950.


NUTTIN’S NEGLECTED CRITIQUE OF THE LAW OF EFFECT 1


ANTHONY G. GREENWALD
Ohio State University


This paper presents and evaluates material from Joseph Nuttin’s
(1953) book, “Tâche, réussite, et échec.” Nuttin’s experiments on
the roles of reward and punishment in intentional and incidental
learning and on the spread of effect are discussed in relation to
current trends in American theorization on these topics. It is
concluded that Nuttin’s work in these areas constitutes a vigorous
and as yet unrefuted attack upon the notion that the effects of
rewards are “direct, automatic, and inevitable.”


In his capacity as the foremost historian of the law of effect,
Postman (1947, 1962) has been a major force responsible for many
psychologists’ continued faith in that law. In his 1947 review,
Postman conservatively concluded that the law of effect had not
been substantiated, but neither had it been gainsaid as a
fundamental fact of learning. In 1962, his conclusions were more
decidedly favorable to the law of effect:


In spite of the many empirical and conceptual problems which still
await solution, the basic propositions of Thorndike’s theory [i.e.,
the law of effect] have weathered with considerable success
theoretical critiques and attempts at experimental refutation. Time
and again, as in his views on punishment and the spread of effect,
he appeared to have been proven wrong but eventually won new
support from still further experimental analyses of these problems.
The picture of the learning process which Thorndike sketched more
than 50 years ago is still very much on the books [p.


The present paper is directed at two goals: (a) to reintroduce some
neglected work of the Belgian psychologist, Joseph Nuttin, to
English-speaking psychologists, and (b) to show how this work
strikes at the basis of Postman’s arguments in favor of the law of
effect.


1 Work on this paper was greatly facilitated by a United States
Public Health Service postdoctoral fellowship to the author,
administered by Educational Testing Service.




Approximately 100 of Nuttin’s experiments on the law of effect are
described in the 1953 book, Tâche, réussite, et échec (Task,
success, and failure). Although important segments of this work had
been published previously in English-language journals (Nuttin,
1947, 1949), Nuttin’s work has almost never been discussed in even
moderate detail in any major English-language work on learning.
Dollard and Miller (1950) and McGeoch and Irion (1952) refer
briefly to Nuttin (1947) in footnotes, and Hilgard (1956) presents
a summary of one of Nuttin’s (1949) experiments; in a review
article on the spread of effect, Marx (1956), a rare exception,
devotes two pages to a critique of Nuttin’s (1949) work. Elsewhere,
it is difficult to find even a passing reference to this rather
large body of work on the law of effect. In the present paper,
Nuttin’s contributions are discussed in three categories: (a) the
effects of reward and punishment in intentional learning, (b) the
effects of reward and punishment in incidental learning, and (c)
the spread of effect.


Prefatory Note: Nuttin’s Dependent Measures

Nuttin’s studies on the roles of reward and
punishment in human learning have used
serial learning procedures similar to those
frequently used by American psychologists in
studying such problems, but have used de-
pendent measures of learning decidedly un-
common in American studies. Nuttin fre-
quently gives his subjects (Ss) only one
trial-and-error run through a serial list of 
learning items before testing them for learn- 
ing. This test then usually involves asking 
Ss to repeat all responses made on the single 
acquisition trial, no matter whether those 
responses had been rewarded or punished. 
Thus, Ss are asked for intentional repetition 
of errors as well as of correct responses. 
Nuttin considers this type of measure su- 
perior to the more customary procedure 
of observing unintentional repetition of er- 
rors during intentional repetition of correct 
responses.

In the present writer’s opinion, Nuttin’s 
type of measure is highly justified for the 
purpose of comparing the effects of reward 
and punishment since it tests what is learned 
from both under exactly parallel conditions. 
The more customary procedure, by com- 
parison, seems biased against observing any 
learning that may occur under punishment, 
since S's task of changing a punished re- 
sponse may often be decidedly more difficult 
than that of repeating a rewarded response. 
Nuttin’s method uses the same repetition task 
for observing the effects of both reward and 
punishment and, frequently, he also asks Ss 
to recall whether each response was rewarded 
or punished. As will be apparent below, 
Nuttin’s consistent use of this type of de- 
pendent measure is essential to the type of 
results he obtains.


EFFECTS OF REWARD AND PUNISHMENT IN 
INTENTIONAL LEARNING 


Chapter VII of Tâche, réussite, et échec 
describes 12 experiments exploring the rela- 
tive effectiveness of reward and punishment 
when Ss are given explicit instructions to 
retain correct responses (i.e., intentional 
learning). The general procedure for most of 
these experiments included a brief acquisition 
period in which Ss were presented with a 
serial list of stimuli and allowed to generate 
trial-and-error responses, some of these 
being called right and others wrong. In the 
subsequent retention test, Ss were asked to reproduce all of their
first-trial responses (right and wrong) after having been led to
believe that they would be required to learn and repeat only the
right ones. The expected result of superior retention of rewarded
responses was obtained in all of these experiments. However, some
of the variations in this procedure gave results that led Nuttin to
reject the idea that it was an automatic effect of reward (as
postulated in the law of effect) that led to differential
retention.

For instance, in one experiment, Ss were shown 100 pairs of words
on a memory drum, each pair being presented for 2.5 seconds
followed by a .25-second signal indicating whether or not S was to
be tested later on that pair. Immediately following the signal, the
next pair of words was presented, thereby (supposedly) leaving no
time for any rehearsal of the 50 to-be-recalled pairs. Ss were
later tested (contrary to expectation) on all 100 pairs by being
shown the first member of each pair and being asked to reproduce
the second member. This experiment can be criticized in that the
signal used to indicate a recall test was the word bien (good), and
mal (bad) signaled that there would be no recall test. Of course,
this choice of signals detracts from Nuttin’s conclusion that it
was incorporation versus nonincorporation in a system of
“persisting task tension” (rather than reward) that resulted in the
observed superior reproduction of the to-be-recalled pairs.
Further, it is not clear that the 2.5-second word-pair presentation
period did not allow Ss some time to rehearse a previous pair that
was to be recalled.

These faults were avoided in a subsequent experiment that used a
task in which S had to estimate the number of units present in
photographs of various homogeneous collections of objects (e.g., a
flock of sheep, a grove of trees, a rack full of bicycles, etc.).
In a preliminary stage, the experimenter (E) asked S to guess the
number of objects in each of 40 photographs. S was not told “right”
or “wrong” for any of these guesses, but was told for 10 of the 40
photos that they would be seen again during the experiment. S was
led to believe that the remaining 30 photos were not to be used
later. On a retention test for all 40 photos, Ss were better able
to reproduce their original guesses for the 10 “later-to-be-used”
ones than for the 30 “not-to-be-reused” ones. (This test was
conducted with E describing the type of object in the photograph,
e.g., sheep, trees, bicycles, and S recalling his guess; S was not
shown any of the cards a second time.) 2
Nuttin concluded from these experiments that Ss’ expectations of
having to repeat rewarded responses, and of not having to repeat
punished responses, may be sufficient to account for the
superiority of reward over punishment usually observed in serial
intentional learning tasks. It should be noted that the data so far
summarized are not sufficient to reject the notion that rewards
have automatic strengthening effects; rather, they demonstrate
support for a plausible alternative hypothesis (similar to that for
the Zeigarnik effect) in terms of “persisting task tension.” 3


EFFECTS OF REWARD AND PUNISHMENT IN
INCIDENTAL LEARNING


Nuttin elaborates his explanation of the typically observed effects
of reward and punishment in terms of what he calls tâches ouvertes
(open tasks) and tâches fermées (closed tasks). In a serial
learning experiment, rewarded items constitute open tasks (S
expects to be asked to reproduce his responses for these items);
punished items constitute closed tasks (S expects not to have to
repeat his responses to these items). The “persisting task tension”
of the open tasks (rewarded items), rather than any automatic
effect of reward, is then assumed to be responsible for Ss’
superior ability to reproduce rewarded responses.


2 In light of the possibility of “isolation” artifacts (cf.
Wallace, 1965) intruding into this type of experiment, it would
have been preferable for Nuttin to instruct Ss that 50%, rather
than 25%, of the stimuli would later be encountered. However, since
this experiment did not serve as the major evidential basis for any
important conclusions, this point is of little importance to the
present argument.

3 It is interesting to note that one of the experiments reported in
Chapter VII (pp. 369-376) of Tâche, réussite, et échec
approximately anticipated the design of an experiment reported
three years later by Bitterman (1956). This was a paired-associate
experiment in which Ss were led to believe that correctly
responded-to first-trial stimuli (half of the first-trial stimuli)
would not be reencountered, whereas incorrectly responded-to
stimuli (for which E immediately supplied the correct response)
would be reencountered. On a second trial, Ss were asked (contrary
to their expectations) for the correct response for all of the
stimuli. It was found that they did better on the stimuli for which
their first-trial responses were called wrong. This indicated that
information received in the context of punishment could be well
retained, when punishment served to generate a persisting task
tension.

In Chapter VI of Tâche, réussite, et échec, 
Nuttin tests this hypothesis by setting up a 
closed-task (i.e. incidental learning) situa- 
tion for both rewarded and punished items. 
A typical experiment used the above-men- 
tioned task of guessing numbers of objects 
in photographs. Ss were asked to estimate 
the numbers, being told “right” for half of 
their estimates and “wrong” for the other 
half. After having been led to expect that
their task ended with this first series of
guesses (i.e., a closed task), Ss were tested
on their ability to recall the numbers they
had originally guessed. Again, the retention
test was conducted without S’s seeing the
stimulus photos a second time. The
finding for a series of such experiments was
that rewarded and punished responses were


the similarities between 
results of their studies (which differed from 
using an ESP task to 
generate the “closed-task” attitude) and
those just described. The conclusion in either
case was that there is no automatic effect of
reward on stimulus-response connections and
that the typical effect of reward in serial
intentional learning experiments is an artifact
of Ss’ intent to learn (i.e., the open-task
attitude) specific to rewarded items.
Postman (1962, pp. 392-396) argues vigor- 
ously against such an interpretation. He cites 
data from three experiments (Postman & 


should be recognized that all three studies he
cited are subject to criticism on methodological
grounds. ... used an ESP
procedure in which only 10-15% of Ss’
first-trial responses were called “right.” The
ensuing superior recall of these right responses
could very easily be attributed to a
perceptual-isolation (or von Restorff) effect
(cf. Wallace, 1965). Postman and Adams
(1955) ran additional conditions, supposedly
to control for such perceptual-isolation effects.
In these additional conditions, Ss were


ever, one must question a control for per- 
ceptual isolation that is based on making an 
event (being correct on an ESP guess of a 
digit between 1 and 10) with a chance prob- 
ability of 10% occur 50% or 85% of the 
time. Certainly, the Ss in such a situation 


learning, the weight of Nuttin's (1953) evi- 
dence in favor of the contrary (non-differen- 
tial-learning) conclusion is impressive. Nuttin 
used several different closed tasks, none in- 
volving ESP, on a total of 930 Ss in 16 
experiments, all demonstrating no differential 
originally rewarded and 
Since Nuttin rewarded 




the quantity of his 
data and the neatness and simplicity of his 
experimental procedures.*


SPREAD OF EFFECT 


* Nuttin describes a small fraction of his data on 
the relative effects of reward and punishment in 
both intentional and incidental learning in his 1947 
article. That article, unfortunately, passes over most 
of this material so quickly that the force of the 
presentations in Chapters VI and VII of Tâche, 
réussite, et échec is virtually entirely absent. 


previously in English (Nuttin, 1949). However,
part of the impact of this work was
lost owing to the omission of some essential
data from the 1949 article.

Nuttin’s basic experimental situation for
the study of the spread of effect used a task
similar to one used by Thorndike, in which S
had to learn a “foreign” language by making
trial-and-error choices among 5 nonsense
word alternatives for each of 40 stimulus
words. After only one trial through the list
(S being informed “right” for a predetermined
25% of his choices), the task was
interrupted and S was asked to reproduce his
first-trial response for each of the 40 stimuli
and to state whether E had declared the
response to be right or wrong. Nuttin’s findings,
using this situation on a total of ...
Ss, were that (a) there was no difference in
ability to recall, upon request, punished
responses that originally occurred one item
away from a reward as opposed to those that
occurred two or more items away, and (b)
there was a greater tendency to recall a
punished response as having been rewarded
if it was located only one item away from a
reward as opposed to a punished response at
a distance of two or more items. (The former
finding was, unfortunately, omitted from the
1949 paper; it is presented on pp. 473 ...
of the 1953 book.) Nuttin interpreted these
findings as showing that the spread of effect
may not be due to diffusion of the automatic
effects of rewards, which should have
produced a gradient on the original-response
recall measure, but is due simply to Ss’
confused memory of the positions in the list at
which rewards occurred.

A variation on the above procedure was
used by Nuttin in five additional experiments
to provide even firmer support for
these conclusions. As in the just-described
experiments, the Ss were tested for recall of
both original responses and reinforcements.
However, without informing the Ss, E
rearranged the sequence of items on Trial 2
(the test trial), leaving the previously
rewarded items in their original positions
but switching punished items that were
previously close to reward with ones previously
relatively distant from reward. Again, there
was no differential recall of original
responses as a function of their
(original) distance from reward (Nuttin, 1953,
p. 476). Again there was a differential recall
of punished responses as having been rewarded
depending on their distance from
reward; however, it was their nearness to
reward on the second, rather than the first,
presentation of the list that produced an
increase in the tendency to recall punished
responses as having been rewarded! (Recall
that responses near to reward on Trial 2
had been relatively distant from reward on
Trial 1.) This latter finding strongly indicates
that the tendency to recall punished
responses as having been rewarded was due
to Ss’ tendencies to err in recalling the serial
locations of the rewarded items in the list
and not to any diffusion of “automatic” effects
of rewards. It is but a short further
step to the conclusion that “spread of effect”
phenomena may be produced as an artifact
of these tendencies.


A Reply to Marx 


Marx’s (1956) review of literature on the
spread of effect attempts to document the
argument that a spread of effect is found
even when all possible sources of artifact are
experimentally controlled. Marx (1956, pp.
158-160) parries the thrust of Nuttin’s
(1949) troublesome data with two arguments:
(a) Nuttin’s account of spread phenomena
is actually indistinguishable from
Thorndike’s; and (b) Nuttin’s use of measures
of learning that hinge on Ss’ ability to
recall previous wrong responses as well as
right ones is both counter to tradition and
insufficiently justified. Both of these arguments
of Marx’s do injustice to Nuttin’s
position. In the first place, Nuttin’s cognitive
interpretation of the spread of effect is very
clearly distinguishable from Thorndike’s
S-R position. Nuttin rejects an
explanation in terms of an automatic
effect of reward. In fact, Nuttin’s (1949)
experiment in which the positions of punished
items were interchanged on the retention test
constitutes a confrontation between his own
position and Thorndike’s. In the second
place (see also our prefatory discussion of
this point), the use of the recall-of-original-response
measure is very well justified by
Nuttin (1953) in terms of his acute awareness
of the learning-performance distinction;
the traditional measure of involuntary repetition
of errors is simply not capable of
distinguishing between (a) strengthened S-R
association on the punished item (an event
occurring on the acquisition trial) and (b)
mistaken recall of the item’s reinforcement
(an event occurring on the test trial).

Marx does not limit his attack
on Nuttin’s position to such “theoretical”
arguments; he buttresses the attack by citing
the studies of Muenzinger and Dove (1937)
and Marx (1957), in which attempts were
made to control the source of artifact suggested
by Nuttin: S’s uncertainty about the
locations of rewards in a serial list. Muenzinger
and Dove made certain that one of




give the correct response unfailingly.
As a result of the latter fact, the
spread of effect observed in this
group cannot be used as an argument against
the “uncertainty” artifact; it is well recognized
(Marx, 1956; Postman, 1962) that




the rewarded 
spread of effect resulting from 
guessing sequences. Marx’s (1957) — 


methodological defect; 
have cited the data in
support rather than in rebuttal of Nuttin’s
arguments since, in fact, the spread of effect
observed in that group was not significantly
different from that observed in a
guessing-control group.

These various arguments should make it
clear that Marx’s refutation of Nuttin is
virtually without substance. Further, it should
also be stated that, despite the rather exceptional
detail in which Marx (1956, pp. 158-160)
presented Nuttin’s case, it cannot be
considered that he presented it entirely
fairly. He passed very briefly over the crucial
experiment (Nuttin, 1949, Experiment II) in
which the positions of punished items were
rearranged on the test trial, and, of course,
he made no reference to the more extensive
documentation of Nuttin’s argument that
appears in Tâche, réussite, et échec.


A Reply to Postman 


Postman’s (1962) review attempts to leave 
the reader with very much the same faith 
in the spread of effect that Marx displayed. 
It is very surprising that Postman made no 
mention of Nuttin’s work in this recent re- 
view, since the fact that Postman and Adams 
(1954) cited Nuttin (1949) in a footnote 
indicates that Postman must have been aware 
of Nuttin’s work. Further, Postman and
Adams (1954, 1955) had collected data that 
could have been used as an empirical check 
on Nuttin’s hypotheses, but were not. That 
is, Postman and Adams’ Ss were given a 
test trial on which the positions of rewarded 
items had been left constant while those of 
punished items were scrambled, and their Ss 
were asked, as were Nuttin’s, to recall both 
their original responses and the reinforce- 
ments received for them. It is not clear from 
their report whether or not Postman and 
Adams’ Ss were informed of the revised test- 
ing order of the items—a potentially critical 
point. 

It should be noted that Postman (1962) 
did comment on a position vaguely similar 
to Nuttin’s—specifically, Zirkle’s (1946) ar- 
gument that the spread of effect is an artifact 
of the isolation of rewarded items
in a serial list. In discussing Zirkle’s arguments,
which in themselves are not a matter
of present concern, Postman stated that


effects [p. 377]. 


The second part of this “highly implausible
conclusion” may be recognized as the crux
of Nuttin’s argument: “. . . that proximity
to an isolated reward . . . systematically
distorts memory for after-effects.” Postman
should not have considered this to be implausible,
not only because of his presumed
awareness of Nuttin’s data in support of this
conclusion, but also because data from some of
his own studies (Postman & Adams, 1954,
p. 625; 1955, p. 104) also showed the
same systematic distortion in memory for
aftereffects as a function of proximity to
an isolated reward.

It should be noted, in fairness to both
Marx and Postman, that neither of them
treats the spread of effect as an
established theoretical concept. Marx,
particularly, is quite explicit about the
fact that research remains to be done in
order to settle the issue.


IMPACT OF NUTTIN’S FINDINGS


It probably would have been difficult, if
not impossible, for Postman (1962) to arrive
at conclusions so favorable to the law of


ones, and (b) that when all sources of artifact
are controlled, a residual spread of effect
still may be found. Both of these arguments
are vital in support of the notion that the
action of rewards is “direct, automatic, and
inevitable [Postman, 1962, p. 396].” Nuttin’s
work, as has been shown, strongly undermines
both of these supporting arguments.
Nuttin repeatedly found (a) that in the
absence of an intent to learn, rewarded and
punished associations are equally well retained,
and (b) that spread-of-effect experiments
are susceptible to a usually uncontrolled
artifact stemming from Ss’ inability to
recall the precise locations of rewards in
serial lists.

Those who have argued for the validity
of Thorndike’s law of effect have not yet
met (and, indeed, may have great difficulty
meeting) the challenge posed by Nuttin’s
findings. Studies demonstrating differential
effects of reward and punishment in incidental
learning suffer, as has been shown
here, from potential sources of artifact that
have been well controlled in Nuttin’s studies
demonstrating nondifferential learning. In the
case of the spread of effect, the definitive
(i.e., artifact-free) experiments have not yet
been done.
Part of the force of Nuttin’s findings
is derived from his use of dependent measures
of learning used only rarely in American
research. The validity of Nuttin’s measures
has been defended briefly in this paper. A
more extended debate on this topic might,
in the present author’s opinion, prove highly


It is hoped that this brief presentation
of some of Nuttin’s work will lead students
of learning to look more skeptically at the
law of effect and, further, to examine Nuttin’s
work with the care which it deserves.


INTERNAL VERSUS EXTERNAL CONTROL OF REINFORCEMENT:
A REVIEW


HERBERT M. LEFCOURT 1
University of Waterloo 


A summary of research concerning the construct, internal vs. external control 
of reinforcement is presented. Investigations with this variable have utilized 
situational manipulations of locus of control or have involved differential 
predictions to given situations based on measures of the internal-external 
control dimension. In both types of investigation, locus of control is found 
predictive to different social behaviors, learning performances, and to more 
and less achievement-related activities. Suggestions for further areas of study
are presented.


Under various rubrics, and from diverse 
orientations, investigators have concerned 
themselves repeatedly with man’s ability to 
control his personal environment. Concepts 
such as competence, helplessness, hopeless- 
ness, mastery, and alienation have all been 
utilized in one way or another to describe 
the degree to which an individual is able to 
control the important events occurring in his 
life space. 

The theorist who has most extensively 
written about the overcoming of helplessness 
and the development of mastery is Alfred 
Adler (Ansbacher & Ansbacher, 1956).
Adler’s concept of “striving for superiority” 
is posited as a universal, basic motive de- 
riving from man’s inherent, initial inferiority. 
As opposed to popular distortions of Adler’s 
superiority concept, Adler’s concern was for 
man’s becoming more effective in controlling 
his personal world. R. W. White’s constructs, 
which he called competence and effectance
(White, 1959) can be viewed as describing 
the same referents as Adler’s superiority 
striving. 

Research conducted by Richter (1959) and
Mowrer and Viek (1948) has been concerned
with this area of interest in studies of animal
behavior. Richter reported that even vigorous
animals, when placed in situations where no
solutions (escape) were possible, ceased efforts
and rapidly succumbed to death. Controlling
for all alternative hypotheses, Richter
concluded that the loss of hope (of being
able to effect a change) was the critical
variable.

1 The author would like to express appreciation to
Irwin W. Silverman for his critical reading and
encouragement throughout the preparation of this
paper.

In an investigation of helplessness, Mowrer
and Viek (1948) found that matched groups
of shock-controlling and shock-noncontrolling
rats differed in eating inhibition after
shock periods. They concluded that an
uncontrollable painful stimulus arouses an
apprehension that this stimulus could last
indefinitely or get worse, whereas the same
stimulus, if subject to control, arouses little
or no apprehension. Mowrer labels this
apprehension of uncontrolled pain as “fear from a
sense of helplessness.”

Common to Richter’s, Adler’s, White’s,
and Mowrer’s formulations is the emphasis
on instrumentality, the strength of contingency
between acts and their effects. All
four theorists stress the importance of
instrumentality for survival and
behavior.

It is the purpose of this paper to present
the background and research on a construct
labelled internal-external control of reinforcement
which has facilitated the exploration of
this problem of contingency between act and
effect.

The internal-external control construct
(subsequently referred to as “control”) differs
from the aforementioned concepts (hopelessness,
helplessness, competence, etc.) in
being an integral unit of an elaborate
theory. It is an expectancy variable rather
than a motivational one (as is White’s
competence, for instance). In Rotter’s social
learning theory (Rotter, 1954), the potential
for any behavior to occur in a given situation
is a function of the person’s expectancy
that the given behavior will secure the available
reinforcement, and the value of the
available reinforcements for that person. In
a particular situation, the individual, though
desirous of an available goal, may believe
that there is no behavior in his repertoire that
will allow him to be effective in securing the
goal. Within this specific situation, the person
may be described as anticipating no contingency
between any effort on his part and the
end results in the situation. This description
of an external-control expectancy is not
merely applicable to the extreme punishing
situations described by Mowrer and Richter
but can be seen as applicable in many events
in most persons’ lives. For example, after
wagering on a horse at a race track, only
very odd persons may entertain the belief
that they can exert some control over the
outcome (legally). In Rotter’s theory, the
control construct is considered a generalized
expectancy, operating across a large number
of situations, which relates to whether or not
the individual possesses or lacks power over
what happens to him. Throughout this article,
individuals are labelled external controls when
they are said to have a generalized expectancy
that reinforcements are not under their control
across varying situations. In layman’s
language, these persons may be described as
lacking self-confidence, or in Adler’s terminology,
suffering from inferiority feelings.

In the first expository paper dealing with
the control dimension (Rotter, Seeman, &
Liverant, 1962), the construct was described
as distributing individuals according to the
degree to which they accept personal responsibility
for what happens to them. As a general
principle, internal control refers to the
perception of positive and/or negative events
as being a consequence of one’s own actions
and thereby under personal control; external
control refers to the perception of positive
and/or negative events as being unrelated to
one’s own behaviors in certain situations and
therefore beyond personal control.

In describing the research completed with
the control construct, this paper is divided
into two sections. The first presents findings
from experiments in which task structure was
varied, inducing a specific expectancy of high
or low control. The second section presents
the results of experimentation with perceived
control as a generalized expectancy
(personality characteristic).


Internal-External Control as Determined by 
Task Structure 


The earliest published report of task struc- 
turing of control from a social learning theory 
framework is that of Phares (1957). Phares 
gave one group of subjects instructions which 
emphasized that success on a task (color or 
length-of-line matching) was due to skill. A 
second group was given instructions which 
emphasized that success on the same task 
was due to chance. Despite the fact that
all groups received the same number and 
sequence of reinforcements, subjects with skill 
directions changed expectancies more fre- 
quently and more in the direction of previous 
experience (fewer unusual shifts such as 
raises in expectancy following failure or de- 
creases in expectancy following success). 
Phares concluded that his findings support 
the view that “categorizing a situation as skill 
leads the subject to use the results of his 
past performance in formulating expectancies 
for future performances.” 

A second study in this series was reported 
a year later (James & Rotter, 1958). In this 
investigation, the effects of partial versus 
100% reinforcement schedules upon trials to 
extinction were explored with reference to a
skill versus a chance-task categorization. As 
in Phares’ experiment, a task (a simple card- 
guessing problem) was used in which success 
was completely controlled by the experi- 
menter although it could appear noncon- 
trolled to the subject. Subjects in chance- 
and skill-direction groups were instructed that 
success was controlled by chance or by their 
own skill, respectively. The findings revealed 
that under the skill condition the usual 
superiority of partial reinforcement for
resistance to extinction did not obtain. In fact,
under these conditions the 100%-reinforcement
schedule led to less (though not significantly
so) rapid extinction than the 50%-reinforcement
schedule. The chance condition
produced findings typical of prior partial re- 
inforcement studies: the 100%-reinforced 
chance group was significantly quicker to 
extinguish than the 50%-reinforced chance 
group (and the 100%-reinforced skill group). 
The 50%-reinforced chance group was sig- 
nificantly slower to extinguish than the 
50%-reinforced skill group. 

James and Rotter explain their findings 
on the basis of subjects’ perceptions or cate- 
gorization of the task: in chance (externally- 
controlled) situations, the change from 100% 
to 0% reinforcement clearly signals a change 
in the situation (experimenter’s manipula- 
tion). Consequently, extinction or a change 
in behavior is rapid. The partially-reinforced 
chance condition, however, does not allow for 
the quick perception of a changed situation. 
Consequently, extinction is more gradual until 
the change becomes evident to the subject. 
Under skill conditions, subjects would be 
more likely to explain the nonreinforced 
extinction trials as reflecting their own lack 
of skill rather than reflecting changed opera- 
tions in the task, and they might therefore 
persist in an attempt to improve their per- 
formance. Consequently, James and Rotter 
attribute the lack of difference between par- 
tial and 100% reinforcement on extinction 
trials with skill directions to the way subjects 
perceive the task, demonstrating the impor- 
tance of the subjects’ expectancies of internal 
or external control.

To make the James and Rotter (1958) 
findings more analogous to classical extinction 
studies in which responses rather than verbal 
statements are obtained, Holden and Rotter 
(1962) replicated the James and Rotter study 
with the 50%-reinforcement groups using a 
direct-response technique (betting) rather 
than the stated expectancy method. Briefly, 
these investigators found the same effects of 
skill and chance instructions when money 
betting was used to index experimental 
extinction, skill subjects taking considerably
less time to extinguish than chance subjects
in the partial-reinforcement extinction
experiment.

A second experiment by Rotter, Liverant,
and Crowne (1961) sought to replicate the
James and Rotter findings without using
differential instructions as the experimental
manipulation. In this investigation, subjects were
presented with either of two tasks which
would be regarded as skill- and chance-controlled
tasks on the basis of the previous
cultural experiences of the subjects. One task
involved a motor-skill apparatus (vertical
Level-of-Aspiration Board) while the other
involved the card-guessing procedure used in
the James and Rotter (1958) study. The
results strongly supported the hypotheses that
greater increments and decrements in verbalized
expectancies would be found under skill
conditions and that extinction of expectancies
under continuous negative reinforcement
reverses under chance and skill conditions (a
50%-reinforcement group is more resistant
to extinction than the 100% group only under
chance conditions, while the reverse holds
true under skill conditions). The interpretations
drawn from these findings are similar to
those in the James and Rotter (1958) article.
The one exception to the findings is that
unusual shifting was not more common in
either group. However, the low number of
training trials (nine) may have attenuated
a possible distribution of unusual shifts.

Blackman (1962) undertook an investigation
to determine whether the apparent patterning
of events, as opposed to short, seemingly
nonpatterned sequences, in the presentation
of two flashing lights would lead to extinction
results similar to those found under
skill and chance conditions. He reasoned that
long or patterned sequences would lead a
subject to believe that predictions of the event
could be made depending upon his skill to
comprehend the pattern, whereas short sequences
would lead the subject to perceive
the patterns as unpredictable (external control).
Blackman found that sequence length
and number of sequences significantly affected
the number of “wrong” guesses (selection
of the light that was no longer
illuminated during extinction) throughout
extinction trials. The more sequences and
the shorter the sequences during training,
the more wrong responses were given, and
the greater was the expectancy associated
with these wrong responses found during
extinction. Subjects who were given the
“skill” or internal-control facilitating sequences
were found to make fewer errors, or,
in other words, to adapt to the new sequencing
in extinction more readily. This study, however,
contained no description of reinforcement
schedules so that the results were not
directly comparable with the previous studies.
Secondly, trials to extinction (the complete
elimination of wrong responses) were not
significantly predicted on the basis of sequencing.
Nevertheless, the results provide an interesting
finding in that when the subject perceives
that he is able, through some modicum of personal
activity, to predict the events occurring in
a given situation, he becomes more accurate in
his perception of changes in that situation.
The lack of prediction of trials to extinction
suggests that subjects under long sequencing
probably interpreted the extinction series as
just another long sequence, and still attempted
to predict its termination, consequently making
occasional “errors.”

Phares (1962) reported a second experi- 
ment in which skill versus chance directions 
were used to differentially affect the per- 
ceptual thresholds for nonsense syllables, 
half of which were paired with shock. The 
experiment was designed to test the hypothesis
that when escape from a painful stimulus
is possible only on a chance basis, the
difference between pre- and postexperiment
recognition thresholds for shock-associated
stimuli will be smaller than in a skill
situation where escape depends on the subjects’
ability to perceive the same stimuli.
The experimental design, excepting the
perceptual-response variable, was very similar to
Mowrer’s (1948) and Richter’s (1959) experiments,
described earlier. All involved
avoidance of painful stimuli with interest
focused on the effects of controllability of
the escape. Phares based his predictions on
the rationale that an expectancy of control in
the shock situation would lead the subject
to behave in a manner most likely to capitalize
on his ability to control the situation, which
in this experiment consisted of lowering
thresholds of recognition. Phares’ subjects
in the internal-control conditions learned
to press buttons associated with given shock- 
related syllables to terminate shock. Subjects 
under chance conditions, on the other hand, 
were told that the correct button for terminat- 
ing shock changed continuously so that escape 
occurred only on a random basis. As in the 
Mowrer and Viek (1948) experiment, chance 
subjects were matched with skill subjects 
with respect to number of escapes and syl- 
lables on which escape occurred. The results 
indicated that threshold decrements were sig- 
nificantly greater for skill than chance sub- 
jects for both shock- and nonshock-related 
syllables. Phares included one control group
in his design which received no shock so that 
their behavior as indicated by the pre-post 
measures was not instrumental for pain re- 
lease. The nonshock control group performed 
remarkably like the skill-shock group. Both 
groups differed significantly from the chance 
or external-control group. This latter finding 
is very interesting in light of Mowrer's de- 
scription of his shock-controlling rats. Those 
rats who could control the shock demon- 
strated no fear and acted almost “nonchalant” 
in face of the painful stimulus. No inter- 
ference with activities such as eating was 
found. Likewise Richter’s “hopeful” rats 
(Richter, 1959) resumed what appeared to 
be normal, vigorous responding as soon as 
anticipation of effective escape was restored. 
The similarity between humans and rats in 
their nondisturbance with pain when control 
of that pain is possible suggests that the con- 
trol dimension may have relevance to a wide 
range of human and infrahuman responses. 
At this point, it might be relevant to cite 
two research findings with animals which in- 
dicate the importance of locus of control for 
predicting differential responses to the same 
stimuli. The Walter Reed Army Institute of 
Research group (Brady, 1958; Brady, Porter, 
Conrad, & Mason, 1958) has reported 
studies concerning the development of ulcers 
in rhesus monkeys. Although these writers 
have become more involved in the effects of
sequence and duration of trials as determinants
of ulcer formation, one remarkable finding
they reported was that only monkeys who
exerted control over a painful stimulus developed
ulcers, while their partners who were
linked in series connections and passively
received the same shock failed to develop ulcers.
In these experiments, the production of ulcers 
seems to be related to having control over 
aversive stimulation. An animal experiment 
reported by Malmo (1963) described a dif- 
ferential response to extinction of septal 
stimulation by rats in classical versus 
instrumental conditioning situations. Malmo 
explained the phenomena as follows: 


In the passive (classical conditioning) situation it 
were as though “hope” was not aroused by the 
tone (CS), or, if it was aroused, that “anger” failed 
to occur, when in extinction the tone was not fol- 
lowed by septal stimulation. Anthropomorphically 
considered, it were as if having exerted no effort to 
obtain the septal stimulation the animal had not 
earned the right to object to its omission! Seriously 
though, it would appear that the animal’s bar press- 
ing response in producing the brain stimulation (and 
the proprioceptive feedback) was essential for the 
appearance of the expectancy frustration phenomena 
[p. 19]. 


Briefly, Malmo’s rats acted “angry” and 
fitful when responses previously reinforced 
in instrumental conditioning ceased to evoke 
the reinforcement, while the same reinforce- 
ment (CS-UCS linkage) in classical con- 
ditioning with subsequent extinction led to no 
such reaction. 

Although there is no ready explanation for 
these findings, they seem relevant enough to 
the human and animal work reported above to 
suggest that the locus-of-control variable 
may have implications for a wider spectrum of 
problems and species than previously believed. 

The remainder of this paper presents a 
review of the literature dealing with predic- 
tions of individual differences from measures 
of internal-external control as a generalized 
expectancy. 


Internal-External Control as an Intrapersonal 
Variable 


The first attempt to measure the internal- 
external control dimension as a personality 
variable in social learning theory was re- 
ported in a doctoral dissertation by Phares 
(1955). Phares designed a 13-item scale to 
measure a general attitude or personality 
characteristic of attributing the occurrence of
reinforcements to chance rather than [illegible].
Within groups receiving skill versus chance
directions for color- and line-matching tasks,
he found some low-level predictions of fre-
quency of shifting and unusual shifts with the
scale of chance orientation.

With a more lengthy revision of the Phares
scale, James (1957) found a significant
relation between the James-Phares Likert-
type scale and the Incomplete Sentences Blank
personal adjustment score (Rotter, 195[?]).
The relationship appeared to be curvilinear,
extreme internals and extreme externals ap-
pearing less adjusted. In two subsequent
master’s theses (Holden, 1958; Simmons,
1959) additional correlates of the James-
Phares scale were investigated. Briefly, the
James-Phares scale was found to correlate
(r = .51, N = 101, with intelligence partialed
out) with the California F Scale, which was
interpreted as reflecting the successful measure-
ment in both scales of the degree to which
individuals see the world as containing power-
ful forces that they cannot influence. Sec-
ondly, behavior on the Level of Aspiration
Board (Rotter, 1954) was related to the
James-Phares scale. For clarity of presenta-
tion, the description of the subjects on the
internal-external control dimension will be
discussed in terms of degree of externality.
Highly external subjects shifted estimates
frequently, apparently unable to arrive at a
stable evaluation of their own skill. Patterns
derived from Level-of-Aspiration performance
(Rotter, 1954) which indicate cautious, de-
fensive or failure-avoidant strategies seemed
more characteristic of highly external persons
on the James-Phares scale, while the more
aggressive, success-striving patterns seemed
more common to those scoring low in ex-
ternality on the James-Phares scale.


2 It has been posited by critics of the F scale
that acquiescence or agreeing response tendencies
account for a considerable proportion of the results
obtained with the F scale. However, a recent paper
by Rorer (1965) has demonstrated that response
styles are inadequate as alternate interpretations of
data derived from measures such as the F scale.
There appears to be less consistency between
measures of response styles than between attitude
measures of authoritarianism.


Since the presentation of the James-Phares
scale, a series of new scales have been utilized, 
some designed for testing special age groups. 
The Internal-External Control Scale is a 
forced-choice-type measure offering alterna- 
tives between internal- and external-control 
interpretations of various events (Rotter, 
Seeman, & Liverant, 1962). A monograph 
providing extensive data on the development, 
validity, and reliability of the Internal-Ex- 
ternal Control Scale is currently in press 
(Rotter, 1966). The Locus of Control Scale 
for children is an orally administered true-
false scale (Bialer, 1961); The Children’s
Picture Test of Internal-External Control 
presents a series of cartoons about which a 
child states “what he would say” in the 
depicted lifelike situations which involve at- 
tribution of responsibility (Battle & Rotter, 
1963); The Intellectual Achievement Re- 
sponsibility Questionnaire contains forced- 
choice items for children pairing an internal 
and external interpretation of achievement 
outcomes. This scale provides for possible 
differences between responsibility attribution 
for failure and success outcomes (Crandall, 
Katkovsky, & Crandall, 1964); The Power-
lessness and Normlessness Scales contain
Likert-type scales derived from sociological
studies of alienation (Dean, 1961).

One set of research findings with the con-
trol dimension involves the prediction of ex-
ternality in known ethnic groups. With the
assumption that Negroes in the United States
can easily perceive impediments in the way
of goal striving, several studies have success-
fully predicted greater externality among
Negroes than among whites. Battle and Rotter
(1963) found an interaction between race and
social class on the control variable as meas-
ured by a projective device called The Chil-
dren’s Picture Test of Internal-External Con-
trol. Lower-class Negroes were significantly
more external than lower-class whites or
middle-class Negroes and whites. In addition,
highly external children reported significantly
lower mean expectancies for success on a line-
matching test. For comparative purposes, the
Bialer Locus of Control Scale was compared
with the Picture Test for 40 subjects and
correlated significantly with it (r = -.42, p
< .01). A high score on the Bialer scale
indicates low externality while a high score 
on the Picture Test indicates high externality. 
Since the two measurement devices differ so 
greatly in format and the sample-size used 
was small, the relationship between them 
may be more significant than the obtained 
magnitude suggests. The Bialer scale also 
related significantly to the number of unusual 
shifts made in expectancy statements in the 
line-matching task (r = -.47, p < .01, N =
40). Highly external subjects raised their 
expectancies after failure and lowered them 
after success more often than subjects low in 
externality. 

Using a similar argument—that racial 
segregation and discrimination means to Ne- 
groes that their own efforts will lead to no 
reinforcements unless adventitious circum- 
stances make it so—Lefcourt and Ladwig 
(1965a, 1966) successfully predicted higher 
external-control expectancies among Negro 
than among white prison inmates (most 
of whom were from low socioeconomic back- 
grounds) on six different measures: the In- 
ternal-External Control Scale, Dean's Power- 
lessness and Normlessness Scales, and three 
indices derived from performance on the Level 
of Aspiration Board—number of shifts, num- 
ber of unusual shifts, and patterns. Negroes 
scored significantly higher in externality on 
the three scales and performed in ways in- 
terpreted as reflecting external control on the 
Level of Aspiration indices. In a comparison 
of the reformatory samples with a normative 
population on the Powerlessness and Norm-
lessness Scales, Negro inmates scored sig-
nificantly higher on the powerlessness vari-
able. White inmates failed to differ from the
normative population on the powerlessness 
measure though scoring higher in normless- 
ness. (Powerlessness refers to the lack of 
power to cause ends and is more similar to 
the control construct. Normlessness refers to 
the belief that conventionally-approved path- 
ways can not be used effectively to attain 
desired ends.)

In a third ethnic-group investigation, 
Graves (1961) and Jessor adapted the In- 
ternal-External Control Scale for high school 
students and studied ethnic differences in an 
isolated tri-ethnic community. They found
whites to be least external, followed by 
Spanish-Americans. Indians were the most 
external in attitudes. These findings were
consistent with predictions about the groups. 
Although economic factors undoubtedly con- 
tributed to differences, Graves felt that “eth- 
nicity” was an important source of variance 
after other factors were controlled. 

From a different orientation, Strodtbeck 
(1958) has discussed a construct called 
“mastery” and presents research relating 
religious, national, and social-class orienta- 
tions within families to the development of 
mastery. Strodtbeck’s scale of mastery seems 
very similar to the control dimension stressing 
effectance belief. Strodtbeck found Jewish 
middle- and upper-class subjects more mastery 
believing than lower-class Italians. Most of 
the variance was attributable to social class. 

Using subjects enrolled in a southern Negro 
college, Gore and Rotter (1963) found that 
the Internal-External Control Scale predicted 
the type and degree of commitment behavior 
manifested to effect social change. Those sub- 
jects scoring lowest in externality signed 
statements expressing the greatest amount of
interest in social action (the March on Wash- 
ington and forming a freedom riders group) 
while the more external subjects either ex- 
pressed no interest in participation or minimal 
involvement (willingness to attend a rally). 
This study has since been replicated with 
nearly identical results (Strickland, 1965). 

In all of the reported ethnic studies, groups 
whose social position is one of minimal power 
either by class or race tend to score higher 
in the external-control direction. Within the 
racial groupings, class interacts so that the 
double handicap of lower-class and “lower- 
caste” seems to produce persons with the 
highest expectancy of external control. Per-
haps the apathy and what is often described
as lower-class lack of motivation to achieve 
may be explained as a result of the disbelief 
that effort pays off. In short, the “oppressed” 
groups can be described as analogous to 
Mowrer’s rats whose “fear of fear” led to 
nonsurvival behavior. Bettelheim (1952) 
discussed an analogous accommodation to de- 
creased opportunity in Nazi concentration 
camps. He found that prisoners ceased to be
active and responsible “subjects” and be-
came passive, irresponsible, and childlike “ob-
jects” under such oppressive conditions.

Two other studies have concerned group 
differences in perceived control with reference 
to pathological populations. Bialer (1961) 
administered a Locus of Control Scale orally 
to retarded and normal children (mean ages 
of 10 years 4 months and 10 years, respec-
tively). Rather than compare the two groups,
Bialer combined them into one large sample 
and sought intercorrelations among mental 
age using the Peabody Picture Vocabulary 
Test (Dunn, 1959), locus of control, prefer-
ence for return to completed versus inter-
rupted tasks, and gratification patterns (im-
mediate vs. delayed-reinforcement preference).
Bialer predicted that all of these variables 
would be interrelated, reflecting “conceptual 
maturity.” Contrasting the relationships of 
chronological and mental age, he found that 
mental age accounted for most of the variance 
involved among three criteria variables. With 
an N of 89, locus of control (the higher the
score, the lower the externality) correlated
positively with mental age (r = .56) and
with deferred gratification preference (r =
.47), both significant at the .01 level. An
obtained relationship between locus of control
and chronological age (r = .37, p < .01) was
minimized when mental age was partialed out
(partial r = .02). On the other hand, mental
age and locus of control remained strongly
related (r = .47) with chronological age
partialed out.
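
The partialing described above can be written out explicitly. What follows is only a sketch: it assumes the standard first-order partial correlation formula, which the review does not state was the procedure used, and the subscripts L (locus of control), M (mental age), and C (chronological age) are labels introduced here for illustration.

```latex
% First-order partial correlation of L with C, holding M constant
% (standard textbook formula; assumed, not taken from Bialer's report).
r_{LC \cdot M} \;=\;
  \frac{r_{LC} - r_{LM}\, r_{CM}}
       {\sqrt{\left(1 - r_{LM}^{2}\right)\left(1 - r_{CM}^{2}\right)}}

% With the reported values r_{LC} = .37 and r_{LM} = .56, a partial
% near zero requires the numerator to be near zero, i.e.
%   r_{CM} \approx .37 / .56 \approx .66 .
```

The correlation between mental age and chronological age is not given in the text, so the value of roughly .66 is only what the formula would imply under this assumption, not a reported result.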

In a study comparing schizophrenics and 
normals, Cromwell, Rosenthal, Shakow, and
Zahn (1961) used the James-Phares scale, an
early form of the Internal-External Control
Scale, and the Bialer-Cromwell Locus of Con-
trol Scale. On all three measures, they found
schizophrenics to be significantly higher in
externality than normals. In addition, they
were interested in investigating the differential
effect of autonomous (internal) and controlled
(external) conditions upon reaction time
(RT). Conflicting findings about schizo-
phrenics’ RT deficiencies had prompted
Shakow (1950) to note that some of the RT
tasks differ in the degree to which they allow 


the subject autonomy for initiating his re- 
sponse. Cromwell et al. conducted an RT 
experiment with their subjects under four 
conditions, two of which were self-directing, 
and two of which differed in degrees of ex- 
ternal control. They found that normals did 
better (lower RT) in and preferred situations 
allowing autonomy, while schizophrenics did 
better in and preferred externally controlled 
situations. This was interpreted as reflecting 
the external-control schizophrenics’ distress at 
decision-making in autonomous conditions. 
Within the normal sample, the James-Phares
scale correlated significantly with RT superi-
ority in autonomous situations (r = .74, p <
.05, N = 13). The greater the tendency of
normal subjects to answer in the direction of
external control on the James-Phares scale,
the less they improved their performance in
the autonomous conditions over and above
that in the controlled conditions. While
superiority of performance in autonomous
conditions correlated with the other scales
(highly external subjects performing less ad-
equately with autonomous conditions), the
relationships fell short of significance due to
the small sample size, though all were in the
same direction. Within the schizophrenic
group, however, correlations all approached
zero. The lack of relationship in the latter
group may be accounted for by the small
variance in and extremity of external con-
trol in that group.

The remaining studies reported in this
paper concern specific behavioral correlates of
the control dimension rather than group dif-
ferences on that dimension.


Learning and Achievement 


Since the control dimension is usually
measured by scales stressing academic in-
terests, it would seem likely that learning
[illegible] and achievement behaviors would be
[illegible] related to control. Early grade school
children differed according to sex on the Intel-
lectual Achievement Responsibility Question-
naire (Crandall, Katkovsky, & Crandall, 1965),
[illegible] being more prone to assign responsibility
[illegible] for results eventuating from
intellectual achievement efforts (Crandall,
Katkovsky, & Preston, 1962). In the same
investigation, the Intellectual Achievement 
Responsibility Questionnaire and other meas- 
ures were compared with four achieve- 
ment-related activities (time spent in intel- 
lectual free-play activities, intensity of striv- 
ing in intellectual free-play pursuits, intel- 
ligence test performance, and reading and 
arithmetic test performances). Briefly, re- 
sponsibility attribution was significantly re- 
lated to most criteria for males but not for 
females. Male subjects who attributed achieve- 
ment responsibility to themselves spent more 
time in intellectual free-play activities (r = 
.70, p < .05, N = 20), demonstrated greater 
intensity of striving in intellectual free-play 
pursuits (r = .66, p < .05), scored higher on 
intelligence tests (Stanford-Binet, r = .52,
p < .05), on reading achievement tests (r =
.51, p < .05), and arithmetic achievement
tests (r = .38, p < .10). In the same in-
vestigation, a TAT measure of need for 
achievement failed to relate to any of the 
criterion situations. 

In two studies deriving from a sociological 
emphasis on alienation, Seeman (Seeman, 
1963; Seeman & Evans, 1962) has reported 
differential learning between internals (low 
alienated) and externals (high alienated) in 
two field settings. With groups matched on 
socioeconomic and hospital-experience vari- 
ables, Seeman and Evans found that hospital- 
ized tuberculosis patients characterized as ex- 
ternal controls had less objective knowledge 
about their own conditions. This differential 
knowledge about health matters was evidently 
revealed in their ward behavior, as indicated 
by the fact that multiple and independent 
staff describers of the patients were in agree- 
ment concerning the low information pos- 
sessed by the more external-control patients. 
The scale used for differentiating low and 
high externals was a shortened version of the 
Internal-External Control Scale. In the second 
investigation (Seeman, 1963), an attempt was 
made to control for intelligence and the 
novelty of the stimulus materials to be 
learned. Seeman presented materials related 
to correctional matters to a sample of re- 
formatory inmates. Three kinds of informa- 
tion, differing chiefly in immediate relevance 
to inmate attempts to control important goals,
were presented to the prisoners. This informa-
tion concerned (a) the present reformatory
setting, (b) factors related to achieving suc-
cessful parole, and (c) long-range prospects 
for a noncriminal career. The essential pre- 
diction was that inmates scoring low in ex- 
ternality would show superior retention of 
the parole material, since this material most 
clearly implies the possibility and value of 
personal control. The findings demonstrated 
no differences in learning the materials in the 
first and last categories above. However, 
inmates low in externality learned the parole- 
related material significantly better than in- 
mates high in externality. When inmates were 
divided into “Square Johns” (inmates who 
had earned merits demonstrating conformity 
to institution demands) and “Real Cons” 
(those with no merits) the control measure 
showed significant prediction of parole-knowl- 
edge learning only in the “Square John” 
group. The “unconventional” inmates demon- 
strated no relationship between alienation and 
learning. Among those committed to habilita- 
tion values, expectation of personal control 
was significantly related to learning of rele- 
vant information. Again no differences were 
found in the learning of information that was 
not relevant to personal control of important 
goals. These findings indicate the importance 
of values as well as expectancies for making 
differential predictions of learning. For con- 
trol purposes, Seeman also administered the 
Marlowe-Crowne Social Desirability Scale 
(Crowne & Marlowe, 1960). The social de- 
sirability measure was found to have no rela- 
tionship with the criteria. 

In summary, in investigations concerned 
with learning and achievement-related vari- 
ables, the control construct allows some pre- 
diction when the materials are relevant to 
the subjects’ goal strivings. However, success- 
ful predictions in this area were found only 
in male samples. The one study that included 
female subjects (Crandall et al., 1962) re-
vealed no relationship between perceived con-
trol and achievement behaviors for girls (cor-
relations averaged around zero). Perceived
control, as need achievement, may be less
useful for predicting females’ achievement be-
haviors than it is for males’. More investiga-
tions including sex as a variable are necessary.

The remainder of the studies to be reported
do not fall into any convenient cluster for
reporting purposes and are presented as
miscellaneous investigations.


Conformity 


In a study concerning personality char-
acteristics of conformers (Odell, 1959), a
significant relationship was found between the 
Internal-External Control Scale and Barron's 
(1953) Independence of Judgment Scale, with 
subjects high in externality showing greater 
tendencies to conform. As part of a larger 
investigation concerning conformity, Crowne 
and Liverant (1963) reported supporting evi- 
dence for Odell’s findings. When subjects in 
Asch-type conformity situations had to make 
bets concerning their accuracy, subjects high 
in externality were found to conform sig-
nificantly more than subjects low in ex-
ternality (as measured by the Internal-Ex-
ternal Control Scale). Additionally, with con- 
fidence in outcomes expressed in terms of 
amount wagered, highly external subjects 
tended to be less confident than low-external 
subjects. Differences in relative amounts bet 
on “conforming” and “independent” trials 
were also found between low- and high-ex-
ternal subjects. Low externals bet approxi-
mately the same on both conforming and
independent trials, while high externals bet
significantly less on independent trials than
on trials in which they yielded (t = 2.68,
p < .02). Also, the greatest differentiation
between low- and high-external subjects in
amounts bet occurred on independent trials, 
low externals betting more than high-external 
subjects. 

Crowne and Liverant also used Level-of-
Aspiration patterns (Rotter, 1954) for the
prediction of conformity. As described in the
Lefcourt and Ladwig (1965a) study, pat-
terns “one” and “three” represent the low-
externality patterns. These patterns are char-
acterized by average to moderately high mean
difference scores with an average number of
shifts, and no or rare unusual shifts. Crowne
and Liverant describe patterns one and three
as indicative of confidence and an achieve-
ment or success orientation, while pat-
terns “four” and “seven” represent a more fail-
ure-avoidant, defensive, goal-setting behavior.
These patterns are characterized by either
continuous shifting with each outcome or by
an excessively high negative difference score
average with unusual shifting, primarily
downward after success. The results compar-
ing Level-of-Aspiration performance and con-
formity are included here because of their
previously explored relationship with the con-
trol dimension (Lefcourt & Ladwig, 1965a;
Simmons, 1959). Failure-avoidant groups
conformed more than subjects with achieve-
ment patterns and tended to be less confident,
as expressed in expectancies and betting,
than the achievement-oriented groups, though
the difference failed to reach statistical signifi-
cance. Crowne and Liverant interpreted their
findings as portraying the conformer as one
who has low expectancies of success in socially
evaluative situations, as reflected in a high
external-control or defensive Level-of-Aspira-
tion pattern.


Liverant and Scodel (1960) hypothesized
that subjects low in externality would be-
lieve that they could exert a modicum of
control [illegible], while subjects
high in externality would view out-
comes in such situations as occurring ran-
domly. Subjects were engaged in a risk-taking
situation in which they were required to bet
on the outcome of 30 trials of dice throwing.
Subjects had to select amounts to bet, as well
as one of seven alternative bets with
[illegible] objective probabilities. Liverant and
Scodel predicted that low-external subjects
would select more high-probability, low-pay-
off bets than high-external subjects. They
found that low externals chose significantly
more bets of intermediate probability and
significantly fewer low-probability bets than
high-external subjects. Also, more low
externals than high externals never selected
an extreme high- or low-probability bet. Low
externals wagered more money on cautious
than on risky bets. In short, perceived control
was found to differentiate behavior in the
risk-taking situation, low externals revealing 
a greater tendency toward self-regulation with 
regard to objective probabilities. 

In a near replication of the Liverant and 
Scodel study, Lefcourt (1965) compared the 
risk-taking behavior of Negroes and whites 
whose behavior had reflected high-external 
and low-external-control orientations, respec- 
tively, in previous experimentation in skilled 
tasks (Lefcourt & Ladwig, 1965a). On the 
assumption that a chance task would elicit 
less defensiveness or failure avoidance than a 
skill task for Negroes, it was predicted that 
Negroes would prove less external than 
whites in a chance situation. Using the same 
task and the same indices as Liverant and 
Scodel, Negroes were found to choose fewer
low-probability bets, and were generally less 
risk-taking than whites. This reversal of in- 
ternal-control reflecting behavior in skill ver- 
sus chance situations was interpreted as being 
due to Negroes’ disbelief that achievement in 
self-evaluative, skill-demanding tasks is con- 
trollable. Success in externally controlled 
situations (luck- or fate-determined) seems 
more controllable for the Negro who believes 
that goals derived through achievement will 
be denied him regardless of his effort, while 
externally controlled goals are, at least, ob- 
tained fairly. 


Further Correlates of the Control Dimension 


In an investigation of strategy preference, 
Lichtman and Julian (1964) had subjects 
estimate their performance at each of a num- 
ber of distances in throwing 12 darts at a 
target. After having established the distance 
from which the subject judged he could score 
with five and seven darts respectively, the 
subject was asked to choose the distance 
from which he would prefer to throw, given 
the conditions that at the closer distance he 
would be provided with only five darts while 
at the farther position he would receive 
seven. Consequently, the conditional prob- 
abilities of success were equated at the two 
distances though they differed in the degree 
of actual control that the subject could prob-
ably exert over the outcome. With the In-
ternal-External Scale, Lichtman and Julian
found a significant difference between low- 
external and high-external subjects in choice 
of position, subjects low in externality more 
often choosing the closer distance, while sub- 
jects high in externality preferred the farther 
distance at a 4:1 ratio. This finding parallels 
that of Liverant and Scodel (1960) in that 
low-external subjects prefer the high-prob- 
ability choices through which to maximize 
their successes. An additional finding in the 
Lichtman and Julian study is a reported cor- 
relation with the Marlowe-Crowne Social De- 
sirability Scale (1960) (r= —.39, p < .05, 
N = 28), and an insignificant relationship 
(r = —.27, N = 28) with a measure of need 
for achievement. The former finding indicates 
a weak but significant tendency for persons 
with high need for approval to be doubtful 
about their personal efficacy. This relation- 
ship has also been reported in a study by 
Strickland and Rodwan (1963). However, 
several other studies (Crowne & Liverant, 
1963; Seeman, 1963) have reported no rela- 
tionship between social desirability and con- 
trol measures. The statistically insignificant 
but negative relationship between need 
achievement and external control supports a 
previously reported finding by Odell (1959) 
in which need achievement and externality 
were significantly related (r = -.25, p <
.05). In the Odell study, a larger sample al-
lowed for greater statistical significance. How- 
ever, the magnitude of the relationship seems 
fairly consistent. The need achievement meas- 
ure used by Odell derives from TAT-like stim- 
uli whereas the Lichtman and Julian study 
utilized the French (1958) method. The simi- 
larity of results despite different techniques 
used argues for the stability of this modest 
relationship. Theoretically, one would expect 
internal-control persons to demonstrate the 
search for mastery that need achievement 
defines.

Two other correlates of interest have been
reported by Butterfield (1964). In an ex-
tensive correlational study, Butterfield found
strong correlations among the Internal-Ex-
ternal Control Scale and the Child and
Waterhouse (1953) Frustration-Reaction In-
ventory, and the Alpert-Haber Facilitating-
Debilitating Test Anxiety [illegible]
(1960). Responses to the Child-Water-
house measure are categorized into three
groups: constructive response to frustration,
intropunitive response, and extrapunitive re-
sponse to frustration. Butterfield predicted
and found a relationship between external
control and constructive response to frustra-
tion scores (r = -.37, p < .02, N = [illegible]).
Partial correlations between each frustration-
reaction score and external control, with each
of the other two reaction scores held
constant, were r = .57, p < .01 between ex-
ternality and intropunitive, and r = [illegible], p
< .01 for constructive reactions. As per-
ceived locus of control became more external,
constructive responses decreased, and intro-
punitive responses increased when other
response scores were partialed out. No rela-
tionship with extrapunitive responses was
found. This finding indicates that the less
external individual claims that he reacts in
a more problem-solving direction despite
frustration, wasting less time on [illegible]
rumination and self-accusatory gestures that
detract from problem-solving efforts.
In regard to the Alpert-Haber measure, two
scores are derived: a debilitating-anxiety and
a facilitating-anxiety measure. The [illegible]
while the latter measure produced an r =
-.68, p < .01 with external-control scores.
Facilitating anxiety correlated with construc-
tive response to frustration (Child-Water-
house measure), r = .49, p < .01. Other
scales of the two measures were unrelated.
Partial correlations between each anxiety
score and external control, with the other
anxiety type held constant, were presented.
The results were as marked as with the frus-
tration-response measure. External control
correlated (r = .61, p < .01) with debilitat-
ing-anxiety scores when facilitating-anxiety
scores were partialed out; external control
correlated with facilitating anxiety (r = -[illegible],
p < .01) with debilitating-anxiety scores held
constant. Facilitating anxiety decreased and
debilitating anxiety increased as locus of con-
trol became more external.
Since both the frustration and anxiety
indices are self-report measures, the findings
[illegible] low-external subjects
depict themselves as goal-directed workers
who strive to overcome hardships, whereas
high-external subjects portray themselves as
suffering, anxious, and less concerned with
achievement per se than with their affect re-
sponses to failure. The high multiple cor-
relation (R = .81), which does not differ
significantly from one approaching unity, be-
tween locus of control and the facilitating-
and debilitating-anxiety scores raises a ques-
tion about the independence of these scales.
The items and format of the scales are suffi-
ciently different that one would not anticipate
such relationships as were obtained. How-
ever, the facilitating-debilitating anxiety scale
does concern responses to achievement situa-
tions, as do several items in the locus-of-con-
trol scale. Consequently, the strong relation-
ships obtained may indicate the greater suc-
cess expectancies in achievement situations of
the less-external-control individuals. It should
be noted that the Alpert-Haber anxiety meas-
ure derives from work concerning test anxiety
(Mandler & Sarason, 1952), the measure of
which has been used as an index of fear or
expectation of failure in achievement situa-
tions by Atkinson and Litwin (1960).


Summary and Conclusions 


Research findings from experiments manip-
ulating apparent controllability and investiga-
tions using measures of locus of control to
make differential predictions of control-related
[illegible] have been reported. It can be con-
cluded that perceived control is a useful
variable, and, in relation to the types of ex-
periments noted in the introductory section,
may be related to problems such as psycho-
pathology, apathy, and withdrawal phenom-
ena. It is no mean coincidence in time that
[illegible] such as Piaget (Flavell, 1963) and
Michotte (1963) have also been doing ex-
tensive research into the causal relationships
that Western man imposes upon his world;
and that psychotherapists such as Adler
(Ansbacher & Ansbacher, 1956) have con-
cerned themselves with man’s development of
[illegible]. White’s concern with effectance and
competence mirrors this same focus of interest
in personality psychology. As indicated in the
range of studies described, this concern with
control is amenable to research, though more
investigation is needed.

At a time when problems of response sets 
and response styles beset any investigator 


predominan: 
retical orientation, has involved the use of
several different measurement techniques. 
Forced-choice, Likert-type scales, true-false 


tasks have all been utilized and have demon- 
strated some efficacy in predicting different 
criteria related to the locus-of-control dimen- 
sion. The success of a variety of techniques in 
measuring the control dimension provides sup- 
port for the construct validity of that dimen- 
sion and argues against a response-style inter- 
pretation of scale performance. 

Insofar as response set is concerned, several 
investigators have reported correlations be- 
tween perceived control and social desirability. 
Overall results indicate that the relationship 
[...]
significance. With but a proportion of
the variance accounted for in the relation- 
ship between control and social desirability, 
the response-set interpretation of control- 
scale data appears nondefensible. In addition, 
social desirability has proven ineffective in 
the prediction of criteria related to the con- 
trol dimension in several investigations. 

Another question may be raised regarding 
the relationship between the control dimension 
and intelligence. As indicated in two studies
(Bialer, 1961; Crandall, Katkovsky, & Pres- 
ton, 1962) intelligence is positively related to 
perceived internal control. Cromwell (1963) 


intelligence. If such were the case, then a 
measure of intelligence would perhaps be
preferable to the attitude scales used in
measuring the control variable. However, in
studies where the range of intelligence is not 
as extensive as in the above-mentioned in- 
vestigations (Bialer used retardates in his 
study), little relationship has been found 
between intelligence and control measures. In 
fact, one investigation (Battle & Rotter, 
1963) reported a reversal—lower-class Ne- 
groes with high IQs being more external than 
middle-class whites with lower IQs. 

Two questions of immediate interest that 
have not been investigated in any depth con- 
cern the origins and sources of control orienta- 
tions and the operations for altering such 
orientations. Pertinent to the origins problem
is the study by Strodtbeck (1958) in which 
attitudes similar to those of internal-control 
expectancies were investigated relative to fam- 
ily structure. Relevant to expectancy altera- 
tion, one study by Lefcourt and Ladwig 
(1965b) sought to vary expectancy by a 
“reference group manipulation.” In this study, 
Negroes who had previously been character- 
ized as highly external were led to believe that 
they were being studied as jazz musicians. In 
a game situation, the usually high external- 
control Negroes persisted in competition 
against a white opponent despite continuous 
losses when they believed that the experi- 
menter was interested in them as jazz musi- 
cians. Two control groups (a second jazz musi- 
cian group for whom jazz cues were irrelevant, 
and a nonmusician group) failed to show 
the same persistence. In this experiment 
where external-control orientations should pre- 
dict failure-avoidance (quitting the experi- 
ment), Negroes continued to meet competitive 
challenges if they maintained expectancies 
other than those for themselves as Negroes. 

Despite the leads from these research re- 
ports, little has been reported on how internal- 
or external-control expectancies become gen- 
eralized across differing situations. Work 
needs to be done on specific antecedents of 
internal- and external-control orientations, 
and on the factors leading to the generaliza- 
tion of these orientations. In addition, the 
breakdown of external-control expectancies 
assumes more than a theoretical interest when 
programs are currently being devised by gov-
ernmental agencies seeking to ameliorate
problems of poverty and racial barriers, the
very problems which seem to generate ex-
ternal-control orientations and their con-
comitants of apathy and lack of goal-striving
behavior.


A research paradigm is introduced for investigating the
[...] person learns to predict the behavior
[...] derived from Brunswik’s probabilistic functionalism and his
[...] behavior. Methods of analysis are applied to
[...] experiment. Results of the experiment show that
[...] the results are also shown to have implications
[...] studies of interpersonal perception.


The concept of interpersonal learning is 
introduced here as the process whereby one 
person learns to predict the responses of 
another person to a variety of situations. It 
is a process hardly touched so far by studies
within the framework of traditional studies of
learning or by studies of interpersonal per-
ception.


LEARNING, INTERPERSONAL PERCEPTION, AND 
INTERPERSONAL LEARNING 
Learning 
Traditional studies of learning have ignored 
this aspect of human behavior because they 
have been directed toward the organism’s 
ability to cope with physical stimulus ar- 
rangements or verbal material; such ar- 
rangements being used because of the ease 
with which such stimuli may be identified, 
quantified, and manipulated. Although cer-
tain studies of learning have included hu-
mans as part of the stimulus array (e.g.,
Bandura & Walters, 1963; Miller & Dollard,
1941; Suppes & Krasne, 1961) the human
behavior in these studies is treated merely
as a cue to a physical state of affairs to be
learned about; in studies of interpersonal
learning, however, it is the interaction be-

The research reported here was undertaken in
the Behavior Research Laboratory, Institute of Be-
havioral Science, University of Colorado and is
Publication No. 67 of the Institute. This
[...] by a research grant (GS-266) from
the National Science Foundation. Grateful ac-
knowledgment [...]
of another person.
data provided by 


| 


per- 


ality 
(e.g., Cline & 
mond, 1957). However, 
with veridical achievement have encountered 
severe methodological difficulties, and, as a 
result, many researchers turned to the in-
vestigation of processes involved in inter-
personal perception (Tagiuri & Petrullo, 
1958). 

Interest in process is shown by studies of 
person-cognition (see Beach & Wertheimer, 
1961; Newcomb, 1958). Research in this 
area has been concerned with the cognitive
complexity of the perceiver (Bieri & Blacker, 
1956; May & Crockett, 1964), the general 
cognitive organization of impressions (e.g., 
Todd & Rappoport, 1964; Triandis & Fish-
bein, 1963; Wishner, 1960), and the naive
psychology or implicit personal theory of the
perceiver (e.g., Heider, 1958; Jones, Davis,
& Gergen, 1961). Although these studies in-
dicate a greater concern with cognitive proc-
esses, as yet there have been no experimental
studies which analyze the subject’s ability to
learn to predict the interaction of the person-
object with a specific environment, nor have
there been analyses of the limits of such learn-
ing. Clearly, the rates and limits of learning
are the essential characteristics of this funda-
mental psychological process.


[Figure 1 panels. TRADITIONAL LEARNING STUDIES: Environment <- S (growth curves; person-object, if present, incidental). TRADITIONAL INTERPERSONAL PERCEPTION STUDIES: Person-Object <- S (asymptotes; physical task, if present, incidental). INTERPERSONAL LEARNING: Environment <- Person-Object <- S (growth curves and asymptotes; concepts and method focused equally on environment, person-object, and subject; person-object and physical task given equal role in experiment).]

FIG. 1. Similarities and differences between traditional learning studies, interpersonal per-
ception studies, and interpersonal learning.


Distinction between Learning Studies, Inter- 
personal Perception, and Interpersonal 
Learning 


Any effort to analyze the processes in- 
volved in learning about the other person 
will have to take into full consideration the 
environmental determinants of the other’s 
behavior since the other’s behavior does not 
take place in an environmental vacuum. Thus, 
the study of both growth curves and asymp- 
totic levels of interpersonal learning must 
provide methods for (a) environmental analy- 
sis, (b) analysis of the person-object’s re- 
sponse system in relation to the environment 
and (c) analysis of the learner’s cognitive 
processes concerning both environment and 
person-object. Neither traditional learning 
studies nor traditional person-perception 
studies have provided a research paradigm 
which would make such analyses possible. 
The distinction between traditional studies 
and interpersonal learning is illustrated in 
Figure 1. 

In short, traditional studies of learning 
and person-perception (and/or person-cogni- 
tion) have not seriously engaged the problem 
of interpersonal learning. Yet, not only is it 
perfectly apparent that humans do learn 
about one another more or less successfully, 
but it is equally apparent that such knowl- 
edge forms a cornerstone of human society. 
It is hard to escape the conclusion that the
vigorous pursuit of the study of interper-
sonal learning is long overdue. The purpose
of this paper is to describe a research para-
digm which makes it possible to study inter-
personal learning.


RESEARCH PARADIGM AND PROBABILISTIC 
FUNCTIONALISM 


The general research paradigm for study-
ing interpersonal learning to be described
here was derived from Brunswik’s proba-
bilistic functionalism (1952, 1956); the spe-
cific point of departure is Brunswik’s (1955)
lens model of behavior. An illustration of the
research paradigm will be followed by a
description of its conceptual context.


Description of the Research Paradigm: An
Example


Two subjects (Ss) separately learn a 
standard multiple-cue probability-learning 
task; such tasks typically require S to infer 
the various scale values of a criterion variable 
from data provided by several cues, each hav- 
ing a different probability relation to the 
criterion variable. Ss are trained in such a
way that each S develops a different set of
cue dependencies; thus, for example, S1 learns
to depend most heavily on Cue A, less heavily
on Cue B, and least on Cue C; S2 learns the
reverse set of dependencies. Each cue de-
pendency may be specified by the experi-
menter (E). This procedure not only assures
that each S will have something to learn 
about the other, but also provides a means of 
specifying precisely what and how much S 
will have to learn. It also permits the speci- 
fication of whatever similarities and differ- 
ences between the Ss the investigator wishes 
to establish. In effect, the Ss are “reared” in 
two different settings which E may arrange
to suit his purposes. 

Following their separate training, Ss are 
required to work together on a task which 
appears to be identical to the training task; 
they are not informed of their differences in 
training. Unknown to the Ss, the new com- 
mon task has a set of cue validities which is 
different from those learned by either S1 or
S2. The statistical properties of the new task
may be arranged by the investigator to suit
his purposes, that is, the common task may
have cue validities exactly midway between
those learned by S1 and S2 (in the example
given above, Cue A, Cue B, and Cue C would 
be given equal weights in the common task) 
or the common task may be more similar to 
one S’s training than the other’s, and so 
forth. 

On each stimulus presentation in the new 
common task, both S1 and S2 make known
to one another their individual judgments 
about the criterion variable and then agree 
on a joint judgment. A number of such trials 
involving first an individual judgment of the 
criterion variable and then a joint judgment 
provide the experience necessary for each S
to learn about the other. After these trials
[...]

[Table residue. Column headings: S1: (1) Own judgment, (2) Prediction for S2; S2: (3) Own judgment, (4) Prediction for S1. The table entries are not recoverable.]

A comparison of Columns [...] and 3 and [...] 1 [...]
to the various task stimuli as a result of his
[...] both growth
curves and asymptotic levels of interpersonal
learning may be studied under various condi-
tions.


Conceptual and Methodological Framework

Probabilistic Functionalism. Fundamental
to Brunswik’s (1952, 1956) approach is (a)
the subject’s achievement of the correct
[...] of the distal variable under various
[...] conditions. Such achievement
[...] it possible for the organism to
[...] successful and stable way to a remote
state of affairs, despite the uncertainty
(probability character of) the mediating
conditions (see also Heider, 1958). [...]
lens model (Figure 2) illustrates these char-
acteristics.

In studies of multiple-cue probability learn-
ing, S typically learns to predict (achieve)
[...] variable from [...]
extent [...] which probability [...] between cues
[...] arranged to suit the experi-
menter’s purposes (see, for example, Peterson,
Hammond, & Summers, 1965). In a
person-perception study, the person-object
presents cues (e.g., verbal material, phys-
iognomic expressions) to an S [...]
from these cues the person-object’s [...]
on a scaled distal variable (e.g., intelligence,
authoritarianism). The relation between cue
and distal variable is assumed to be prob-
abilistic in these studies and thus susceptible
to quantification in terms of a correlation
coefficient or other statistic appropriate to the
data. The assumption of probabilism is con-
sidered reasonable since no instances of uni-
vocal cues have been found in studies of
person-perception (see Brunswik, 1956; Crow
& Hammond, 1957).


[Figure 2 labels: Distal Variable; Cues (X_i); Cue Validity (r_e,i); Cue Utilization (r_s,i); Achievement (r_a).]

FIG. 2. Brunswik’s lens model.

FIG. 3. Brunswik’s lens model adapted to interpersonal learning.

Application of the Lens Model to the Study
of Interpersonal Learning. The derivation of
interpersonal learning from the lens model is
made apparent in Figure 3. Note that S1 must
cope with a probabilistic environment (i.e.,
task) with its distal variable and the un-
certain data produced by it. Thus, S1 must
make an inference from effect (the cues) to
cause (the distal variable). In interpersonal
learning, S1 must cope with an additional
probabilistic system, S2, which is interacting
with the task, and here he must make an in-
ference from cause to effect, that is, from the
cues (X_i in Figure 3) to the effect they pro-
duce (S2’s response, Y in Figure 3). This is the
fundamental nature of interpersonal learning
from the point of view of probabilistic func-
tionalism.
Lens-Model Equation. Organizing data in [...]

$$ r_a = \frac{R_l^2 + R_{po}^2 - \Sigma d}{2} + C \sqrt{1 - R_l^2}\, \sqrt{1 - R_{po}^2} \qquad (1) $$

where $\Sigma d = \sum_i (r_{ei} - r_{si})(\beta_{ei} - \beta_{si})$, $r_{ei}$ = the correlation between
cue $i$ and the variable estimated, $r_{si}$ = the cor-
relation between cue $i$ and the S’s judgment,
$\beta_{ei}$ = the beta weight for the correlation be-
tween cue $i$ and the variable estimated, and
$\beta_{si}$ = the beta weight for the correlation be-
tween cue $i$ and the S’s response; and C = the
correlation between the variance unaccounted
for by the multiple correlation in the ecology
and the variance unaccounted for by the
multiple correlation in S’s response system.

3 This equation has been simplified by Tucker
(1964) to read:

$$ r_a = G R_l R_{po} + C \sqrt{1 - R_l^2}\, \sqrt{1 - R_{po}^2} $$

where G is a measure of the linearly-predicted com-
ponent of the S’s response variable in relation to [...]

Use of the Lens-Model Equation. The lens- 
model equation may be used to analyze in 
detail the formal, mathematical determinants 
of achievement with respect to interpersonal 
learning. Equation 1 shows that achievement
($r_a$), that is, correct prediction of S2’s re-
sponses by S1, is a function of the properties
of both the learner’s prediction system and
the person-object’s response system, as
previous empirical research in interpersonal
perception has suggested it should be (Baker &
Block, 1957), and as Brunswik (1943, 1956)
indicated it would be.

Equation 1, for example, gives equal status
to both subject and object; achievement is
denoted as a function of (a) the linearity
($R_{po}^2$) of the person-object’s system of re-
sponses to the environment, that is, to the
physical learning task, (b) the linearity
($R_l^2$) of the learner’s prediction system for
the person-object, (c) the extent to which the
weights given the cues by the learner ap-
proximate the weights given the cues by the
person-object ($\Sigma d$), and (d) the extent to
which the learner correctly infers the non-
linear characteristics of the person-object’s
response system (C).

Equation 1, therefore, indicates that there
are four factors determining the S’s achieve-
ment over a series of trials: (a) a given S
may fail to have high achievement because
his prediction system is highly linear al-
though the response system of the person-
object is not (e.g., $R_l^2 > .70$, $R_{po}^2 < .30$);
or (b) because the S’s prediction system is
not linear ($R_l^2 < .30$), when the response
system of the person-object is ($R_{po}^2 > .70$);
or (c) because S cannot match his cue
weights with the person-object’s cue weights
($\Sigma d$ is large); or (d) because the S is not
able to correctly identify the nonlinear char-
acteristics of the person-object’s response
system (C < .70) when the nonlinear char-
acteristics of the person-object’s response
system are important (when $\sqrt{1 - R_{po}^2}$ is
large). (For a more detailed discussion of
these equations the reader should consult
Hursch et al., 1964, and Hammond, Hursch,
and Todd, 1964.)
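
The four determinants listed above can be computed directly from trial-by-trial data. The following is a minimal sketch, assuming the regression form of the lens-model equation (linear component $(R_l^2 + R_{po}^2 - \Sigma d)/2$ plus the residual component weighted by C). The function names, the simulated cue values, and the response weights are hypothetical and are not taken from the experiment. The sketch also checks that the sum of the components reproduces the direct correlation between the learner's predictions and the person-object's responses.

```python
import math
import random


def standardize(values):
    """Return z-scores (population standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def corr(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(zx)


def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [u - factor * v for u, v in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def regress(cues, y):
    """Regress standardized y on standardized cues.

    Returns (cue correlations, beta weights, R^2, residuals).
    """
    z_cues = [standardize(c) for c in cues]
    z_y = standardize(y)
    r_xx = [[corr(a, b) for b in cues] for a in cues]
    r_xy = [corr(c, y) for c in cues]
    beta = solve(r_xx, r_xy)
    r_squared = sum(b * r for b, r in zip(beta, r_xy))
    predicted = [sum(b * zc[t] for b, zc in zip(beta, z_cues))
                 for t in range(len(y))]
    residuals = [obs - pred for obs, pred in zip(z_y, predicted)]
    return r_xy, beta, r_squared, residuals


def lens_model(cues, person_object, learner):
    """Components of the lens-model equation for one learner/person-object pair."""
    r_e, b_e, r2_po, res_po = regress(cues, person_object)
    r_s, b_s, r2_l, res_l = regress(cues, learner)
    sum_d = sum((ce - cs) * (be - bs)
                for ce, cs, be, bs in zip(r_e, r_s, b_e, b_s))
    c = corr(res_po, res_l)  # correlation between the two sets of residuals
    r_a = (r2_l + r2_po - sum_d) / 2 + c * math.sqrt(1 - r2_l) * math.sqrt(1 - r2_po)
    return {"R2_l": r2_l, "R2_po": r2_po, "sum_d": sum_d, "C": c, "r_a": r_a}


# Hypothetical data: 200 trials, three cues, and a shared cue interaction
# (a * b) that gives both systems some common nonlinear variance.
rng = random.Random(0)
cues = [[rng.gauss(0, 1) for _ in range(200)] for _ in range(3)]
person_object = [0.2 * a + 0.5 * b + 0.8 * c + 0.4 * a * b + rng.gauss(0, 0.5)
                 for a, b, c in zip(*cues)]
learner = [0.3 * a + 0.4 * b + 0.7 * c + 0.4 * a * b + rng.gauss(0, 0.5)
           for a, b, c in zip(*cues)]

parts = lens_model(cues, person_object, learner)
direct = corr(person_object, learner)
print(parts)
print("r_a from components:", parts["r_a"], "direct correlation:", direct)
```

Because the residuals of a least-squares regression are uncorrelated with the cues, the two printed values should agree up to floating-point error; the decomposition adds no information about achievement itself, only about its sources.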

The above analysis of the formal determi-
nants of achievement gives concrete expres-
sion to Brunswik’s (1957) insistence on
equal status for subject and object in psycho-
logical research—particularly research dealing 
with interpersonal perception and learning;
thus,
... Our point is to restore or establish the [...]
equality of standards in the treatment of organism
and environment—that is, the equality of 
and situation (or object) in which equal justice is
done to the inherent characteristics of the organ-
ism and environment [p. 6].


The symmetry in Equation 1 accomplishes
this purpose.

Summary. Probabilistic functionalism, to-
gether with the multiple regression analysis
of the lens model, provides a research context
for the study of interpersonal learning in
which experimental variables may be quanti-
tatively controlled and made subject to
mathematical analysis, and in which generali-
zation may be tested.

Analyzing interpersonal learning by means
of the above equation within the framework
of the lens model meets the criteria of
traditional experiments; experimental vari-
ables are denoted, quantified, and their
variation specified. That is to say, (a) the
statistical characteristics of the physical
multiple-cue probability-learning task are
under the experimenter’s control and he may ar-
range these to suit his purposes (see left side
of Figure 3); (b) the statistical character-
istics of the learner’s prediction system and
the person-object’s response system, although
not under restraint, are denoted, quantified,
and their variation ascertained; (c) since
the response-system characteristics of
learner and person-object are randomly sam-
pled insofar as the persons themselves are
random sampling units, generalization to
populations with such response-system char-
acteristics is therefore possible.

We turn now to an illustration of the ap-
plication of the research paradigm described
above.


ILLUSTRATION OF THE USE OF THE INTER- 
PERSONAL LEARNING RESEARCH PARADIGM 


Method 


Fifty pairs of Ss were individually and 
separately trained to infer a criterion vari-
able from three probabilistic cues. For S1 of
each pair, Cue A correlated .80, Cue B cor-
related .50, and Cue C correlated .20 with
the criterion to be estimated from these
cues. For S2, the cue weights were reversed;
Cue A correlated .20, Cue B correlated .50,
and Cue C correlated .80 with the criterion
variable. Following separate training, the two
Ss were brought together to work on a com-
mon task which appeared identical to their
training task, but which actually had cue-
criterion correlations of equal values (.5).
Neither S was informed of the empirical
facts of their training or of the joint task.
As a result, when the Ss arrived at different
judgments concerning the criterion variable
(as they did because of different training)
such differences were attributed by them to
faulty judgment on the part of the other
person or themselves. During the first 80
joint trials, Ss wrote down their private 
judgments as to the criterion values, then 
agreed to a joint decision for the criterion 
value. 
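
The structure of the three tasks (two training tasks with reversed cue validities and a common task with equal validities) can be illustrated with a small simulation. This is a sketch only: it assumes independent, normally distributed cues and a unit-variance criterion, so that each weight equals the population cue-criterion correlation. The function names and seeds are hypothetical, and the sketch does not attempt to reproduce the card decks actually used or their reported multiple correlation.

```python
import math
import random


def make_task(validities, n_trials, seed=0):
    """Generate (cues, criterion) trials with the given population cue validities.

    Assumes independent standard-normal cues and a unit-variance criterion, so
    each weight equals the population cue-criterion correlation.
    """
    noise_var = 1.0 - sum(v * v for v in validities)
    if noise_var < 0:
        raise ValueError("squared validities of independent cues cannot sum past 1")
    rng = random.Random(seed)
    trials = []
    for _ in range(n_trials):
        cues = [rng.gauss(0, 1) for _ in validities]
        criterion = sum(v * c for v, c in zip(validities, cues))
        criterion += math.sqrt(noise_var) * rng.gauss(0, 1)
        trials.append((cues, criterion))
    return trials


def sample_validities(trials):
    """Sample correlation of each cue with the criterion."""
    n = len(trials)
    crit = [y for _, y in trials]
    my = sum(crit) / n
    sy = math.sqrt(sum((y - my) ** 2 for y in crit))
    out = []
    for k in range(len(trials[0][0])):
        xs = [cues[k] for cues, _ in trials]
        mx = sum(xs) / n
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
        out.append(sum((x - mx) * (y - my) for x, y in zip(xs, crit)) / (sx * sy))
    return out


# Validities taken from the Method section; deck sizes follow the text.
s1_training = make_task([0.80, 0.50, 0.20], n_trials=100, seed=1)
s2_training = make_task([0.20, 0.50, 0.80], n_trials=100, seed=2)
common_task = make_task([0.50, 0.50, 0.50], n_trials=80, seed=3)
print([round(r, 2) for r in sample_validities(s1_training)])
```

With only 100 trials per deck the sample validities will scatter around the target values, which is one reason an experimenter might prefer to construct decks by hand rather than by random sampling.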

In order to discover to what extent the 
Ss had learned what response the other per- 
son would make to a given stimulus con- 
figuration, on the final 20 trials each S pre-
dicted what judgment his partner would make
for each stimulus presented. In this study,
the final 20 trials (unknown to S) were
repetitions of the previous 20 trials (thus
providing a measure of S’s reliability over
trials). Because S1 was not specifically in-
structed to observe and learn to predict S2’s
response at the outset of the experiment, this
study might best be described as one in-
volving incidental interpersonal learning. This
procedure was chosen because it is more
representative of normal conditions than
instructed interpersonal learning would be.


Procedure

Materials for the training tasks consisted
of two decks of 100 cards each. Three cues
were pictured on each card: a circle, an
angle formed by two lines intersecting on
the circle, and the point at which the two
lines intersected the [...]
deck, the cues had negligible intercorrelation.
The combined predictive validity of all three
cues accounted for 83% of
the criterion variance (the multiple correla-
tion of cues with the criterion variable
was .91).

By way of introduction [...]
task, Ss were shown sample cards on [...]
the various cue values were [...]
were shown the various sizes which [...]
and angle could assume, and they were in-
structed to interpret the point of intersection
according to an imagined position on a clock
face. For each of the training trials proper,
Ss were told to look at the pictured cues,
write down their judgment about the criterion
(on the basis of the cues), and then to turn
the card over and examine the actual
value. The S’s task, then, consisted of de-
veloping a method of interpreting the cues
so that he could accurately predict the cri-
terion. High achievement was possible [...]
[...]
Half of the Ss learned one set of cue depend-
encies while the remaining [...]
reverse set of cue dependencies.

After being trained individually on the dif-
ferent series of trials, Ss were brought to-
gether in pairs to work on a common task.
The common task consisted of 80 trials in
which the three cues used earlier now had
equal (.5) predictive validity for the cri-
terion. These 80 trials consisted of four blocks
of 20 trials each; and, as in the training
task, in each block the cues had negligible
intercorrelation with one another, and when
combined, accounted for 83% of the criterion
variance (R = .91).

Ss were instructed to first write down their
private judgment, then to discuss their dif-
ferences and come to a joint decision; E
recorded the joint decision and then informed
the Ss of the correct answer. No time limits
were given and Ss were encouraged to discuss
the situation as fully as they wished. An
incentive of $30.00 was offered to the pair
of Ss who were most accurate in their joint
decisions; pairs of Ss were therefore seen to
be competing against one another for this
cash prize.

Upon completion of the joint task, Ss were
informed that they would now be required
to predict the response which they thought
their partner would give to the stimuli. They
were also instructed to write down the re-
sponse they would have given if they had
continued to use their previous system (i.e.,
not to indulge in any new speculations about
the task solution but to remain as consistent
with their former responses as possible).
During this part of the experiment, Ss were
not told the correct answers and were not
allowed any further discussion.

Trials on the prediction task were a repeti-
tion of the last 20 trials on the common task;
none of the Ss verbalized their awareness of
this experimental arrangement.


TABLE 1
SCORES TO BE ANALYZED

1. Reliability of S: comparison of S1’s responses on trials 60-80 with S1’s responses over trials 81-100.

2. Linearity of S’s response system: multiple R² of S’s response over trials 81-100.

3. Variability of S’s response system: σ² of S’s response distribution over trials 81-100.

4. Predictive accuracy: correlation of S1’s predictions with S2’s responses to the task stimuli over trials 81-100.

5. Actual similarity: correlation between S1’s responses to the task stimuli and S2’s responses to the task stimuli over trials 81-100.

6. Assumed similarity: correlation between S1’s responses to the task stimuli and S1’s predictions of S2’s responses to the task stimuli over trials 81-100.

7. S’s accuracy score for the physical task: correlation of S’s responses over trials 81-100 with the correct [...]
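
Three of the scores in Table 1 (predictive accuracy, actual similarity, and assumed similarity) are correlations among the same three response series. The sketch below shows one way they might be computed for a single learner; the function names and the five-trial data are hypothetical and serve only to make the definitions concrete.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def pair_scores(s1_responses, s2_responses, s1_predictions_of_s2):
    """Scores for S1 as learner and S2 as person-object on the final trials."""
    return {
        # How well S1's predictions track what S2 actually did.
        "predictive_accuracy": pearson(s1_predictions_of_s2, s2_responses),
        # How alike the two Ss' own responses are.
        "actual_similarity": pearson(s1_responses, s2_responses),
        # How alike S1 takes S2 to be: S1's own responses vs. his predictions.
        "assumed_similarity": pearson(s1_responses, s1_predictions_of_s2),
    }


# Hypothetical five-trial illustration (the study used 20 trials).
scores = pair_scores(
    s1_responses=[1, 2, 3, 4, 5],
    s2_responses=[2, 1, 4, 3, 5],
    s1_predictions_of_s2=[2, 1, 4, 3, 5],
)
print(scores)
```

In this toy case the learner predicts the partner perfectly even though the two respond somewhat differently, so predictive accuracy exceeds both similarity scores.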


Basic Data 


Table 1 provides a partial list of the basic
data obtained by the above procedures. It
may be noted that each score represents [...]
response and the criterion values. The correla-
tion coefficients representing all scores were
transformed to Fisher Zs and all further
intercorrelations (among scores) were com-
puted between the Zs.
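
For readers unfamiliar with the transformation, Fisher's Z is the inverse hyperbolic tangent of the correlation coefficient. The sketch below illustrates the transformation and the usual way of averaging correlations through it; the function names are hypothetical, and nothing here is taken from the authors' own computations.

```python
import math


def fisher_z(r):
    """Fisher's r-to-Z transformation: Z = atanh(r) = 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)


def inverse_fisher_z(z):
    """Convert a Fisher Z back to a correlation coefficient."""
    return math.tanh(z)


def mean_correlation(rs):
    """Average correlations by averaging their Fisher Zs, then transforming back."""
    zs = [fisher_z(r) for r in rs]
    return inverse_fisher_z(sum(zs) / len(zs))


print(round(fisher_z(0.5), 3))  # 0.549
print(round(mean_correlation([0.2, 0.5, 0.8]), 3))
```

Averaging in the Z metric rather than the r metric avoids the distortion that arises because the sampling distribution of r is skewed for values far from zero.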


Results 


Effectiveness of Differential Training. In
order to be assured that Ss entered the com-
mon task with different cue dependencies,
achievement during training was examined.
First, each of S’s responses during each block
of 20 trials was correlated with the criterion
value and with the cue value. Ss were
found to have significantly increased [...]
level of achievement throughout the [...]
blocks of training (p < .001); there was [...]
significant difference in achievement between
Ss who had been trained to rely [...]
cues in different ways. The mean achieve-
ment over all Ss was .63 (Z) for [...]
training block.

Information Provided by the Components
of the Lens-Model Equation. It will be re-
called from Equation 1 that the determinants
of accuracy are: (a) amounts of [...]
in the learner’s prediction system ($R_l^2$),
(b) the person-object’s response [...]
($R_{po}^2$), (c) degree of matching of the weights
assigned to the cues ($\Sigma d$) by each [...]
and the correlation between residual, non-
linear variances (C) in both the learner’s
prediction system and the person-object’s re-
sponse system. The relative importance of
these determinants will, of course, depend
upon their quantitative significance. If, for
example, the person-object’s responses show
very little linear covariation with the three
task cues, then the linearity of the learner’s
prediction system will not substantially con-
tribute to the accuracy of his predictions.
If S1 assumes that S2 bases his judgments
mainly on the same cue which S1 was trained
to rely on, $\Sigma d$ will be large and achievement
will be low. On the other hand, if there is a
large nonlinear component in S2’s prediction
system which S1 detects and utilizes, C will
be large and achievement enhanced.

Presented below are results concern-
ing achievement, discrepancies between cue
weights ($\Sigma d$), and the effects of the linearity
($R^2$) and nonlinearity ($1 - R^2$) in the pre-
diction and response systems of the learner
and person-object, respectively.
Achievement. Ss’ mean achievement (Z)
was .53 (σ = .02) for the last block of trials
on the common (physical) task. Mean
achievement in predicting the other per-
son’s response was slightly higher (Z = .62,
σ = .02) and the difference is statistically
significant (p < .01). Incidental learning of
the other person was thus slightly better
than directed learning of the physical task.
Differential Cue Weights ($\Sigma d$). A direct
test was made concerning the extent to which
each S was different from his partner. Had
each S in fact learned to make his inferences
on the basis of a different set of cue weights?
For each experimental pair of Ss, $\Sigma d$ was
computed for each block of training trials.
A significant increase in $\Sigma d$ was observed as
a function of the training (p < .001); it may
therefore be concluded that Ss initially faced
partners in the common task with substan-
tially different cue dependencies. Such
analyses of $\Sigma d$ for each pair permit the
determination of the difference between the
cue dependencies of the two Ss at any point
in their training. Similarly, it is possible to
determine the difference between the cue
weights S1 assigns to S2’s response system and
the actual cue weights employed by S2.
Linearity ($R^2$). It should be noted first
that the $R^2$ between the cues in the physical
task and the criterion variable was .83. Ss
were thus placed in a task environment which
should have taught Ss to develop a high
multiple R in their response systems. The
mean linearity of the Ss’ response systems,
however, was only .52 (on the last block of
trials). It had been hypothesized that person-
objects who were highly linear would be
easier to learn about and thus S1 would be
more accurate in predicting a highly linear
S2’s response than a nonlinear S2’s response.
This hypothesis was not strongly supported;
the correlation between linearity ($R_{po}^2$) and
accuracy ($r_a$) was .43.

Nonlinearity (C). The fact that many Ss 
had not developed a high degree of linearity 
(R?) in their response systems made it pos- 
sible to ascertain whether the learner would 
be able to detect and utilize whatever sys- 
tematic, nonlinear variance existed in the 
person-object’s response system. Nearly half 
(N = 43) of the learners were able to predict 
the person-object’s response system better 
than a multiple regression equation; the mean 
C equalled .25, which although small, is 
significantly different from zero (p = .001).

An example of an S’s accurate utilization of
a person-object’s nonlinear variance is illus-
trated by S Number 20. Responses of the
person-object correlated -.13, +.16, and
-.33 with the three task cues respectively
and his $R_{po}^2$ was only .16. The learner’s pre-
dictions correlated +.10, +.18, and -.26
with the three cues, suggesting that S Num-
ber 20 detected a nonlinear use of the cues
by the other person. Moreover, the learner
was able to discern the form of the person-
object’s nonlinear use of the cue data; C for
S Number 20 was .86. As a result, the accu- 
racy of his prediction was very high (.84). 
The full data are presented in the equation 
below. 


$$ r_a = \frac{R_l^2 + R_{po}^2 - \Sigma d}{2} + C \sqrt{1 - R_l^2}\, \sqrt{1 - R_{po}^2} $$

$$ r_a = \frac{.09 + .16 - .06}{2} + .86 \sqrt{1.00 - .09}\, \sqrt{1.00 - .16} $$

$$ r_a = .84 $$
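
The arithmetic for S Number 20 can be checked by substituting the reported components into the equation. The sketch below assumes that the linear term is divided by 2, as in the regression form of the lens-model equation; the function and argument names are hypothetical.

```python
import math


def lens_model_achievement(r2_learner, r2_person_object, sum_d, c):
    """Achievement from the lens-model components (linear term halved by assumption)."""
    linear = (r2_learner + r2_person_object - sum_d) / 2
    nonlinear = c * math.sqrt(1 - r2_learner) * math.sqrt(1 - r2_person_object)
    return linear + nonlinear


# Component values reported in the text for S Number 20.
r_a = lens_model_achievement(r2_learner=0.09, r2_person_object=0.16,
                             sum_d=0.06, c=0.86)
print(round(r_a, 3))  # 0.847
```

The result, about .85, agrees with the reported .84 to within the rounding of the published components. It also shows that nearly all of this learner's achievement (about .75 of the .85) comes from the nonlinear term.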
Results Related to Studies of Interpersonal
Perception. Because rather close theoretical
and methodological relations might be ex- 
pected between interpersonal learning and 
interpersonal perception, several major re- 
current problems inherited from the field of 
interpersonal perception will be examined in 
terms of the paradigm presented above. These 
issues concern such aspects of interpersonal 
perception as: (a) the reliability of the 
person-object, (b) the role of the learner’s 
response invariance, (c) the influence of simi- 
larity between learner and person-object upon 
accurate interpersonal perception, and (d) 
the generality of the learner’s predictive 
success.

Reliability of the Person-Object. Certain 
person-objects should be easier to judge than 
others simply because they are more regular 
or reliable in their responses. In recognition 
of this assumption, the reliability of the 
person-object has been used as a basis for 
selecting person-objects to be included in 
the sample. Interestingly enough, Kremers 
(1960; cited in Krech, Crutchfield, & Bal- 
lachey, 1962) could find only one person who 
met his criteria for reliability; this one 
person was then used repeatedly for all 
judges, thus defeating all claims to person-
object representativeness. Cline and Richards
(1960) found 10 persons out of 25 who 
satisfied their criteria for reliability and used 
these 10 as stimulus objects, thus biasing the 
sample in the direction of high reliability. 
Pyron (1965) found that Ss could not ac- 
curately predict the responses of (artificial) 
person-objects who were consistent 66% of 
the time. 

Results of the present study indicate that 
the role of reliability is not as outstanding 
as might have been suspected. The correla- 
tion between the reliability of the person- 
object’s responses (see Table 1) and the 
learner’s success in predicting those responses 
was only .41. Thus, it cannot be taken for 
granted that reliability is of major signifi- 
cance in all prediction tasks. Our conclusion 
is that reliability is a topic which deserves 
further investigation. 

Stereotyped and Differential Accuracy. Research
in the 1950s indicated that “stereotyped
accuracy” was the principal reason for
whatever accuracy S was capable of (e.g.,
Cronbach, 1955; Hammond, Kern, Crow, 
Githens, Groesbeck, Gyr, & Saunders, 1959). 
When stereotyped accuracy is subtracted 
from S’s score as being somehow artifactual, 
there is little evidence that humans differen- 
tiate among one another with much accuracy. 

In the present study, it was not possible
to measure directly the extent to which Ss
relied upon stereotypes in their prediction
system, since each S predicted the responses
of only one person-object (his partner). How- 
ever, it was possible to examine the extent 
to which S’s predictions applied to his 
specific person-object more precisely than to 
all other Ss participating in the experiment. 

Each of S’s predictions (made with reference
to a specific person-object) was correlated
with the responses of the remaining
(N − 1) Ss, thus providing a 100 × 99
matrix of correlation coefficients which were
transformed to Fisher Zs. The mean of this
population (Z = .37) was compared with the
mean (Z = .62) of the sample of actual predictions
(i.e., correlations between predictions
and responses of experimental pairs only) and
the difference was found to be highly significant
(p < .001). Thus, it can be concluded
that S’s successes in predicting his
specific person-object’s responses were significantly
superior to those “successes” generated
through systematically comparing
Ss’ predictions with all person-objects’ responses.
This result may be interpreted as
evidence in favor of the hypothesis that differential
learning occurred.
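
The Fisher transformation used in this comparison can be sketched as follows. This is a minimal illustration, not the authors' computation: the correlation lists are hypothetical, and only the two reported mean Z values (.62 and .37) come from the text.

```python
import math


def fisher_z(r):
    """Fisher's r-to-Z transformation (inverse hyperbolic tangent)."""
    return math.atanh(r)


def mean_fisher_z(correlations):
    """Average a set of correlations on the Z scale."""
    zs = [fisher_z(r) for r in correlations]
    return sum(zs) / len(zs)


# Hypothetical correlations, for illustration only.
own_partner = [0.60, 0.45, 0.70]          # predictions vs. own person-object
other_subjects = [0.30, 0.40, 0.35, 0.25]  # predictions vs. all other Ss

mean_z_own = mean_fisher_z(own_partner)
mean_z_other = mean_fisher_z(other_subjects)

# Back-transforming the reported means gives the equivalent correlations.
reported_own_r = math.tanh(0.62)
reported_other_r = math.tanh(0.37)
print(round(reported_own_r, 2), round(reported_other_r, 2))
```

Averaging on the Z scale rather than on raw correlations avoids the bias that arises because the sampling distribution of r is skewed. On back-transformation, the reported means correspond to correlations of roughly .55 and .35.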

Similarity. The similarity between perceiver
and perceived was considered to be
an important factor of interpersonal perception
almost from the start. Some investigators
(e.g., Newcomb, 1958) have gone so far as
to suggest that similarity is the definitive
quality of the interpersonal perception process.
Newcomb (1958) further argues that
“. . . accurate communication tends to result
in increased attitudinal similarity . . . .”

In order to evaluate Newcomb’s contention,
the average similarity (Item 5, Table 1)
existing between members of a pair was compared
with average similarity among all Ss
in the experiment. When all Ss’ responses are
intercorrelated, the resulting mean similarity
is .30, whereas the mean similarity among
experimental pairs is .56. Thus, Ss are significantly
more similar to their own partner
than to other Ss. This similarity is solely
attributable to Ss’ interaction with one another,
for it will be recalled that Ss entered
into the interaction task with substantially
different cue dependencies. In short, Ss do
become more similar to one another as a
consequence of their interaction in a task
situation of this sort.

Generality versus Specificity of Accuracy. 
The question of generality is important in 
its own right, but it is also fundamental to 
all research concerning the personality cor- 
relates of the ability to judge, cognize, or 
perceive person-objects correctly. For if such 
ability is not general over various person- 
objects and/or task characteristics to be 
judged, then, of course, the search for per- 
sonality determinates of a generalized ability 
must be abandoned. Unfortunately, present 
results concerning a generalized ability are 
equivocal, and Krech, Crutchfield, and Ballachey
(1962) have concluded that “. . .
there is no clear evidence for a generalized
ability to perceive others correctly . . . .”

Because each S in the present study judged 
only one other S, no evidence was provided 
concerning the problem of whether ability 
to perceive others accurately generalizes over 
person-objects. However, the present study 
afforded a good opportunity to obtain in- 
formation concerning a generality problem 
of a broader nature, that is, whether the 
ability to learn a physical multiple-cue prob- 
ability task is correlated with the ability to 
learn a “human multiple-cue probability 
task.” The correlation between achievement 
on the physical task and achievement in pre- 
dicting the person-object’s response was .43. 
When the similarity between the judge’s own
response and the person-object’s response is
held constant, however, the correlation drops
to .03. Thus, proficiency on the physical task
did not generalize to interpersonal predictions
unless the person-object shared the same
level of task achievement, that is, was in fact
similar to the judge.
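
Holding a third variable constant, as in the drop from .43 to .03 above, is ordinarily done with a first-order partial correlation. The sketch below assumes that standard formula; the paper does not report the two correlations with similarity, so the values of .65 used here are hypothetical and chosen only to show how a correlation of .43 can shrink to nearly zero.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator


# r_xy = .43 is reported in the text; the two .65 values are hypothetical.
r_partial = partial_correlation(r_xy=0.43, r_xz=0.65, r_yz=0.65)
print(round(r_partial, 3))
```

When both variables correlate substantially with the controlled variable, the product of those two correlations can account for almost all of the zero-order correlation, leaving a partial correlation close to zero.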
Summary


The research paradigm presented in this 
paper will permit the study of interpersonal 
learning. The need for such a research para- 
digm was discussed in view of the long post- 
ponement of the investigation of this im- 
portant social process. The conceptual and 
methodological context was explained and 
methods of analysis were applied to an em- 
pirical problem as an illustration of what can 
be learned from the use of the paradigm. 

The most fundamental test the above para- 
digm must meet is whether it does indeed 
provide an adequate model of the process of 
interpersonal learning. In the form presented 
above, the paradigm is of course highly 
schematic; it is no more than a prototype 
of a model from which a more complete model 
of interpersonal learning may be derived. 
But the importance of a correct (or at 
least useful) prototype can hardly be over- 
estimated. The traditional conceptual and 
methodological approaches have not provided 
a prototype within which interpersonal learn- 
ing can be studied. The above material is 
offered as evidence that the model of inter- 
personal learning described here is an 
adequate point of departure for the study 
of this most important aspect of human 
interaction. 
WHAT WORDS ARE STUTTERED?


INSUP K. TAYLOR 1
Lakeshore Psychiatric Hospital, New Toronto, Canada


A number of studies on the properties of stuttered words are critically
reviewed. The factors considered are initial sound, position, grammatical
class, and length of words. . . . information for all these factors, and . . .
complexities for the consonant-vowel effect, are discussed as possible
underlying mechanisms. An analogy in underlying mechanisms is suggested
between hesitation pauses in normal speech and stuttering.


What is stuttering? Johnson and Brown
(1935) give the following definition of “stuttering
spasm” as a footnote:


A stuttering spasm was taken to be any interruption
of the normal rhythm of the reading. It might
take the form of a complete block, undue prolongation
of a sound, a repetition of the initial sound of
a word or syllable, saying “uh-uh-uh,” repetition
of the previous word or words, or a complete
cessation of all attempts to speak for a moment
[p. 484].


Stuttering has rarely been defined in subsequent
studies reviewed, perhaps because the
authors have assumed that the above definition
is more or less obvious to readers. Usually
high inter- and intrarecorder agreements
indicate that such an assumption may be
justified.

Stuttering does not occur randomly. Stutterers
as a group seem to consistently have
more difficulties with some words than with
others, and a given individual shows a tendency
to stutter at the same places on successive
readings of the same material. Many
studies have been concerned with identifying
the properties of words associated with consistently
greater frequency of stuttering. Initial
sound, length, position, and grammatical
class of words have been singled out and
shown to affect stuttering. These four factors
seem to account for most of the word-dependent
difficulties in stuttering.

In this paper, studies on the four above-mentioned
factors are reviewed factor by factor. Possible
underlying mechanisms for the four factors are discussed.

1 The author wishes to thank K. K. Neely and
M. M. Taylor of Defence Research Medical Laboratories,
Toronto, for their critical readings of this . . .

2 Hockett (1958) lists . . . as consonants. Only
Soderberg (1962) eliminated them from his study.
The four phonemes /w/, /h/, /wh/, and . . . in
stuttering, perhaps reflecting their affinity with
vowels. Thus, . . . are unlikely to change the . . .
analysis. No attempt was made in the John-
son and Brown study, either in the design or
in the analysis, to control confounding of 
the initial-sound effect with the other subse- 
quently demonstrated effects. 

Brown (1938a) attempted to isolate the 
effect of initial sound from the other effects 
by using an arrangement of words in random 
order instead of running text. With this 
material, he corroborated the findings of his 
1935 study with Johnson. Comparison of the 
results of the two studies yields a rank-order 
correlation of .68 among the sounds. It is 
doubtful that having words in a random ar- 
rangement actually isolates the initial-sound 
effect: length and possibly some grammatical 
effects may still covary with the sound 
whether the words are in text or not. Associa-
tional effects must also be considered.
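
The rank-order correlation used to compare the sound rankings of two studies can be sketched as below. This assumes the usual Spearman formula for untied ranks; the rankings shown are hypothetical and are not data from Brown (1938a) or Johnson and Brown (1935).

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order correlation for two untied rankings of n items."""
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical difficulty ranks of five sounds in two studies.
study_one = [1, 2, 3, 4, 5]
study_two = [2, 1, 3, 5, 4]

rho = spearman_rho(study_one, study_two)
print(round(rho, 2))
```

A coefficient of this kind measures only agreement in ordering, so a moderate value such as the reported .68 says nothing about whether the confounding factors noted above were controlled in either study.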

Hahn (1942), in an experiment on the 
effects of changes in social complexity upon 
stuttering frequency, also found a range of 
difficulty in sounds and obtained an ordering 
similar to that of Johnson and Brown, with 
vowels being easier than consonants. Here 
again, no attempt was made to control other 
possible factors. 

Quarrington, Conway, and Siegal (1962) 
sought to investigate simultaneously the ef- 
fects of position, grammatical form, and 
initial sound by using a series of sentences 
in which specially selected words displayed 
the various combinations of conditions to be 
examined. An attempt to control length was
not too successful, as words belonging to 
different grammatical classes differed in length 
as well. They selected eight consonants, four 
associated with high and four with low fre- 
quency of stuttering according to Johnson 
and Brown’s rankings, and found no signifi- 
cant difference between the two groups of 
sounds. In retrospect, the selection of sounds
was unfortunate, since most studies have 
shown that the differences among sounds are 
more clear-cut between consonants and vowels 
than within vowels or within consonants. 

Soderberg (1962) used a procedure slightly
different from that of the above investigators
in selecting reading material, and, in contrast
to their results, found that there was no
significant difference among vowel, voiced
consonant, and voiceless consonant in either
frequency or duration of stuttering instances.
Soderberg equated the three lists
of words on several relevant factors. He
also eliminated a number of semivowels
and consonant clusters from the test material.
Scrutiny of Soderberg’s test material 3
suggests that perhaps the discrepancy between
his results and those of other investigators
comes from the facts that (a) within
each of his treatment lists the sounds were
used with different frequency, and it seems
that the consonants which happened to be
difficult for the majority of his subjects
were under-represented while the easy consonants
were over-represented, and (b) different
sounds were not equally used at each
position in the phrases of his lists. Thus,
an interaction between position and sound
effects may further change the difficulty level
of each list. Both of these effects are such
as to depress the difficulty levels of two consonant
lists as a whole and make the difference
between consonant and vowel smaller
than it might be.

Taylor (1966) used an English prose text
as reading material. No factor was arbitrarily
restricted in its range of variability and no
concurrent factors were separated unnaturally.
An iterative analysis of proportions
transformed to logits (Maxwell, 1961)
showed the initial sound effect (in this case,
only the consonant-vowel difference was
tested) to be statistically significant (p
< .001) and to exist independently of the
other factors. Moreover, this effect is stronger
than that of any other factor, being twice as
important as position, and about seven times
as strong as length, though the measured
effect of length presumably varies with the
difficulty of the text.

The above studies show that the initial 
sounds give words a range of difficulty for 
stutterers. What is the nature of these
differences in difficulty?

Ranking of Consonants in Difficulty Does
Not Seem to be Stable. There is large variability
among subjects in ranking consonants.
Johnson and Brown (1935) noted the widely
differing difficulties of the consonants for different
subjects, with 38 randomly selected correlations
between subjects’ rankings ranging
from −.38 to .84. Brown (1938a) obtained
comparable correlations between subjects
on unconnected material ranging from
. . . to .89, with a median of .14. Hahn
(1942) stated that individual stutterers vary
greatly on sounds associated with stuttering
and in the amount of stuttering on specific
sounds. Quarrington et al. (1962) found an
interaction between Subject × Initial Consonant
Phonemes (p < .01) indicating that
different subjects ranked the difficulty of
consonants in different ways. Rankings of consonants
by nine subjects in Taylor’s (1966)
study yielded W, the coefficient of concordance
(Siegel, 1956), of .20 (.01 < p < .05).
Among the different studies, a significant
rank-order correlation on the ordering of
consonant difficulty (.64, p < .01) is found
only between Johnson and Brown (1935) and
Hahn (1942). Rank-order correlations between
Johnson and Brown (1935) and Taylor
(1966) and between Hahn and Taylor are
positive, but not significant, being .30 and .14,
respectively.

If intersubject variability is large in ranking
consonant difficulty, good agreement
among different investigators is not to be expected.
Conditions of procedure, including the
number of subjects, and measures of
median percentage of stuttering were more
similar between Johnson-Brown and Hahn
than between either of these studies and
Taylor’s.

3 The author wishes to thank Dr. Soderberg for
a copy of his test material (personal communication,
November 6, 1964).
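
The coefficient of concordance W cited above can be computed as in the following sketch. It assumes the standard formula for untied rankings (as given in nonparametric statistics texts such as Siegel, 1956); the three rankings are hypothetical and are not Taylor's data.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for m judges ranking n items.

    rankings: list of m lists; each is a ranking (1..n) of the same n items,
    with no tied ranks.
    """
    m = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(judge[i] for judge in rankings) for i in range(n)]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings of four consonants by three subjects.
rankings = [
    [1, 2, 3, 4],
    [2, 1, 3, 4],
    [1, 3, 2, 4],
]
w = kendalls_w(rankings)
print(round(w, 3))
```

W ranges from 0 (no agreement among subjects) to 1 (identical rankings), so the reported value of .20 for consonants indicates weak agreement, in contrast to the W of .89 reported later for parts of speech.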


The Difference Between Consonants and
Vowels Is Clear-Cut, and Appears in Most of
the Studies. Though Johnson and Brown
(1935) and Taylor (1966) do not agree on
rank orders of difficulty within consonants,
they do agree that consonants as a whole are
stuttered more than vowels. Brown (1938c),
referring to the 1935 Johnson and Brown
study, states that

. . . for the group as a whole, however, and in the great
majority of individual cases it was found that
consonants were more difficult than vowels, there
being almost no overlapping in the case of the
rank order of difficulty [p. 225].


In Taylor’s study, 14.5% of initial consonant
events but only 2.7% of initial vowel
events were stuttered. When the sounds were
ranked, there was no overlapping in the ranks
between consonants and vowels. Hahn also 
stated that stuttering occurred more pre- 
dominantly on consonants than on vowels, and 
considered only consonants for her analysis. 
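
The size of the consonant-vowel difference can be expressed on the logit scale, the transformation used in Taylor's analysis. The sketch below only converts the two reported percentages (14.5% and 2.7%) to logits and compares them; it is not the iterative analysis of Maxwell (1961), and the variable names are illustrative.

```python
import math


def logit(p):
    """Log odds of a proportion p (0 < p < 1)."""
    return math.log(p / (1 - p))


consonant_rate = 0.145  # proportion of initial-consonant events stuttered
vowel_rate = 0.027      # proportion of initial-vowel events stuttered

logit_difference = logit(consonant_rate) - logit(vowel_rate)
odds_ratio = math.exp(logit_difference)
print(round(logit_difference, 2), round(odds_ratio, 1))
```

On this scale the two rates differ by about 1.8 logits, that is, the odds of stuttering on a word beginning with a consonant are roughly six times the odds for a word beginning with a vowel.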
Position 

Brown (1938b) reported a decreasing 
gradient of stuttering within both sentences 
and paragraphs. No systematic control of the 
other factors was attempted. Also, he analysed 
only stuttering events on the first, second, and 
third words of sentences. 

Quarrington et al. (1962) obtained an effect 
significant at p < .005 for the position effect. 
In their case, the contrast was between the 
initial and terminal words of six-word sentences.
Conway and Quarrington (1963)
investigated the effects of sounds and positions
with length and grammatical category held
constant. They limited the initial sounds
used in their experiment to four consonants 
which Johnson and Brown (1935) associated 
with low frequency of stuttering, the gram- 
matical class to nouns, and the length to two- 
syllable words. In this situation, the mean 
frequency of stuttering on critical words ap- 
proached a decreasing linear function of 
position, coded as initial, medial, and terminal 
in a seven-word “sequence.” The experimental 
variables, initial sound and position, were 
effectively singled out from the length effect 
and the effect of grammatical category, but 
can one generalize the findings beyond the 
limited range of the controlled factors? From 
other evidence, there seems to exist consider- 
able difference in difficulty between consonants 
and vowels, possibly also between function 
and content words, and between short and 
long words. In each case, Conway and Quar- 
rington’s material contained only one member 
of each pair of the dichotomized variables. 
In a still later study using prose reading
material, Quarrington (1965) found that the
product-moment correlation between stuttering
and word position was −.49, indicating
that the incidence of stuttering decreased
with position in the sentence. As in Brown
(1938b), no systematic control of other factors
was attempted.

In Taylor’s (1966) study, where positions
were divided into initial, medial, and terminal
classifications, a significant position
effect was found, independent of the other 
effects. Here, initial and terminal positions 
referred to beginning and ending words of 
sentences and of phrases bounded by commas. 
The words in between were medial words.
Position linear effect (first − last) was greater
than position quadratic [(first + last) − 2 ×
(middle)], and the position effect as a whole
was larger than the length effect but smaller
than the consonant-vowel difference effect.
The effect of position, in z scores, was independent
of severity of stuttering, as is shown
in Figure 1. Stuttering probability is shown
for each of the first nine sentence positions
for the three severe stutterers and separately
for the six mild stutterers. The figure shows
the probability that the nth word of a phrase
will be stuttered if no prior word in the 
phrase was stuttered. The probability of the 
initial stuttering event gradually goes down 
both for severe and for mild stutterers. 
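
The quantity plotted in Figure 1, the probability that the nth word is stuttered given no earlier stuttering in the phrase, can be estimated as in this sketch. The data layout (one list of True/False values per phrase) and the function name are assumptions made for illustration; the four phrases are invented.

```python
def first_stutter_probability(phrases, max_position):
    """Conditional probability of stuttering at each word position.

    For position n, only phrases that are long enough and contain no
    stuttering before position n are counted ("at risk").
    Returns None for positions where no phrase is at risk.
    """
    probabilities = []
    for n in range(max_position):
        at_risk = [p for p in phrases if len(p) > n and not any(p[:n])]
        if not at_risk:
            probabilities.append(None)
            continue
        stuttered = sum(1 for p in at_risk if p[n])
        probabilities.append(stuttered / len(at_risk))
    return probabilities


# Hypothetical phrases; True marks a stuttered word.
phrases = [
    [True, False, False],
    [False, True, False],
    [False, False, False],
    [False, False, True],
]
probabilities = first_stutter_probability(phrases, max_position=3)
print(probabilities)
```

Conditioning on no prior stuttering keeps the estimate at each position from being affected by whatever happens after the first stuttered word in a phrase.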


Grammatical Class 


Brown (1937) analyzed the data from the
Johnson and Brown (1935) study in terms
of grammatical classes. Differences among
median percentages of stuttering for eight
conventional parts of speech were not great
enough to be statistically significant, but the
present author finds good agreement among
Brown’s subjects on relative rank of difficulty
(W of .89, p < .001). In Table 1, eight parts
of speech are arranged according to the magnitude
of median percentage of stuttering in
Column 1, according to rank orders for W
in Column 2, and the rank orders obtained by
Quarrington et al. (1962) in Column 3.


Fig. 1. The probability of stuttering at nth word
of phrase with no prior stuttering in same phrase.
(Normal transformation used for ordinate. Abscissa:
position of word within phrase, 1 to 9; ordinate:
probability of stuttering.)


TABLE 1

RANKS OF EIGHT PARTS OF SPEECH FOR
STUTTERING FREQUENCY

Rank   Median percentage   Based on W     Quarrington et al.
       of stuttering*                     (1962)

1      adjective           noun           adverb
2      noun                adjective      verb
3      adverb              adverb         adjective
4      verb                verb           noun
5      pronoun             pronoun
6      conjunction         preposition
7      preposition         conjunction
8      article             article

* Overall difference among parts significant at p < . . .

In the Quarrington et al. study, the four
top-ranking parts of speech from Brown’s
median percentage rank were tested for differences
in frequency of stuttering. Although
the four grammatical classes were significantly
different (p < .005) in frequency of stuttering,
their order of rank of difficulty was quite
different from that of Brown, as shown in
Table 1.

Why is there such a discrepancy in the
three sets of orderings of the same four parts
of speech? The slight discrepancy in order
of the parts of speech in Columns 1 and 2 of
Table 1 perhaps reflects the unstable nature
of the rankings, which are easily affected by
differences in methods of ranking. As for the
discrepancies between the rankings of Brown
and of Quarrington et al., different
confounding factors were left uncontrolled.
In the Quarrington et al. study, grammatical
class and length were not completely unconfounded
in spite of the authors’ attempt to do
so. In Brown’s study, the fact that
grammatical classes assume particular positions
in sentences was not considered. Both
length and position have been shown to affect
stuttering. The procedures in the two experiments
were somewhat different: in the Quarrington
et al. study, the test words were
equated in position and initial phoneme, and
the range of variability on other factors was
narrower than was Brown’s.

In Taylor’s (1966) study, there were not
enough long function words to permit the
simultaneous inclusion of length and content-function
in the iterative analysis of proportions
transformed to logits (again showing
how length and grammatical classes are confounded)
so that the function-content effect
could not be determined in relation to the
other three factors. The point-biserial (r_pbs)
correlation between stuttering and the content-function
word dichotomy was .12 (p <
.05), with content words being associated with
higher frequency of stuttering. However, the
content-function r_pbs computed separately for
words starting with consonants and those
starting with vowels were essentially zero,
indicating that the apparent effect of grammatical
class may be due to the tendency of
function words to start with vowels.
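
A point-biserial correlation between a dichotomy (content versus function word) and a score (stuttered or not) can be computed as sketched below. The formula is the standard one; the coding of content words as 1 and the six data points are assumptions made only for illustration.

```python
import math


def point_biserial(group, scores):
    """Point-biserial correlation between a 0/1 grouping and numeric scores."""
    n = len(scores)
    ones = [s for g, s in zip(group, scores) if g == 1]
    zeros = [s for g, s in zip(group, scores) if g == 0]
    p = len(ones) / n
    q = 1 - p
    mean_all = sum(scores) / n
    # Population standard deviation of all scores.
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_difference = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_difference / sd * math.sqrt(p * q)


# Hypothetical words: 1 = content word, 0 = function word;
# score 1 = stuttered, 0 = not stuttered.
is_content = [1, 1, 1, 0, 0, 0]
stuttered = [1, 0, 1, 0, 0, 1]

r_pbs = point_biserial(is_content, stuttered)
print(round(r_pbs, 3))
```

Computing the same coefficient separately within consonant-initial and vowel-initial words, as the text describes, amounts to calling the function on each subset; a coefficient near zero in both subsets indicates that the overall correlation is carried by initial sound.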

The conflicting results of different studies 
on grammatical class may be due partly to 
some differences in methods of classifying 
words and in experimental procedures and 
analysis, and partly to the fact that content 
words are more likely to be long, occupy 
earlier positions in sentences, and start with
consonants. When grammatical classes are
dichotomized into function and content words,
and when this factor is separated from the
other three factors, its effect has not been
demonstrated.


Length 


Brown and Moren (1942), in an experiment
comparing the stuttering of words of
different lengths, concluded that longer adjectives
and prepositions are more difficult
for stutterers than are shorter adjectives and
prepositions. This result held whether syllable-letter,
syllable, or letter categories were used
in measuring word length. Here again, as
Brown points out, he could only speculate
that “the factor of phonetic difficulty explains
part of the above relationships, but not all
the variations can be explained in this
way.” Only an indirect inference regarding the
relation between these two factors was attempted:
“Correlations of .48 and .68 between
the amount of stuttering on adjectives
and prepositions of differing word lengths with
the average phonetic difficulty of the corresponding
groups were obtained [p. 157].”


Taylor (1966) dichotomized her list into
words with five or less letters and those with
six or more. A significant length effect was
found, long words being stuttered more than
short words. The length effect with Taylor’s
text was smaller than either the sound or the
position effect. In a text with more long words,
the length effect would probably be more
pronounced.


Four Factors Together 


The question Brown (1945) put forward 
was: 
... whether these rather obvious characteristics 
[considered above] are the only important determi- 
nants of the loci of stuttering, or whether there are 
other, possibly more subtle, attributes of words 
which serve as stimuli which may elicit stuttering 
[p. 181]. 


The 32 subjects used in his experiment
designed to investigate the above question
stuttered 9.7% of the total words read. Brown
chose 9.7% as a cut-off point for classifying
the initial sounds and grammatical classes
into plus and minus categories. Brown does
not indicate whether or not the 9.7% cut-off
point divided the initial sounds into consonants
and vowels. He states that “The four
parts of speech which occasioned more than
average [more than 9.7 per cent] stuttering
were given plus ratings. . . . These were adjectives,
nouns, adverbs, and verbs [p. 182].”
Considering the position effect, Brown assigned
pluses to the first, second, and third
positions of sentences, and minuses to all the
rest of the positions. Finally, each word of
five or more letters (the average length of all
words in the reading material being 4.65
letters) was given a plus, and of four or
fewer letters, a minus.
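
Brown's plus-minus scoring, as described above, can be sketched as a simple count of how many of the four characteristics a word has. This is an illustration under stated assumptions: the set of plus initial sounds is not given in the text, so it is passed in as a hypothetical parameter, and the first letter is used as a stand-in for the initial sound.

```python
# Parts of speech given plus ratings, as reported in the text.
PLUS_PARTS_OF_SPEECH = {"adjective", "noun", "adverb", "verb"}


def plus_value(word, position_in_sentence, part_of_speech, plus_initial_sounds):
    """Number of plus ratings (0 to 4) for one word.

    position_in_sentence is 1-based. plus_initial_sounds is a hypothetical
    set of first letters standing in for Brown's plus-rated sounds.
    """
    score = 0
    if word[0].lower() in plus_initial_sounds:
        score += 1  # initial sound rated plus
    if part_of_speech in PLUS_PARTS_OF_SPEECH:
        score += 1  # grammatical class rated plus
    if position_in_sentence <= 3:
        score += 1  # first, second, or third word of the sentence
    if len(word) >= 5:
        score += 1  # five or more letters
    return score


hypothetical_plus_sounds = {"b", "d", "g"}
high = plus_value("beautiful", 1, "adjective", hypothetical_plus_sounds)
low = plus_value("of", 5, "preposition", hypothetical_plus_sounds)
print(high, low)
```

Words are then grouped by score, and the percentage of stuttering is tabulated for each group from 0 to 4 plus values.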

Brown found an almost perfectly mono- 
tonic functional relation between plus value 
and percentage of stuttering, characteristics 
with more plus values having a higher per- 
centage of stuttering. Thus, he concluded that 
these four factors (initial sound, length, position,
and grammatical class) are adequate to
account for the loci of stuttering. The results 
also seem to demonstrate that the four factors 
contribute to stuttering in additive fashion. 

These results should be used with caution,
as the procedure seems to have been tautological:
if plus values were given to some initial
sounds and some grammatical classes because
of their above-average (9.7%)
frequency of stuttering, naturally the words
having these two characteristics would be 
found to be associated with high frequency of 
stuttering. If Brown had dichotomized these 
characteristics into two categories on a priori 
grounds—say, initial sounds into consonants 
and vowels, and grammatical classes into con- 
tent and function words—the study would 
have been more meaningful. 

In Taylor’s (1966) study, the factors were 
included in the test material in a natural way, 
and the analysis examined the interrelation- 
ship pattern among three of the four factors. 
To summarize Taylor’s findings, (a) each 
effect was statistically significant, (b) each 
effect was demonstrated independently of 
other effects, (c) the relative contributions of 
the factors to stuttering were different, with 
the greatest contribution coming from con- 
sonant-vowel difference, next from position, 
and then length, (d) in magnitude of con- 
tribution, the consonant-vowel difference was 
about twice as large as position and about 
seven times greater than length, and (e) these 
factors seemed to account for most of the 
stuttering dependent on words. 


STUTTERING PATTERN IN INDIVIDUAL 
STUTTERERS AND IN VARIOUS 
SITUATIONS 


In the experimental procedures commonly 
adopted for studying the four factors, the 
analytic results may reflect only the behavior 
of those severe stutterers who contribute most 
heavily to the total pool of data. Taylor’s data 
were examined for such a possibility by con- 
sidering each factor in the stuttering pattern 
for severe and mild stutterers separately. The 
examination showed that each subject demon- 
strated the consistent stuttering pattern dis- 
cussed in this paper. 

Brown (1945; Figure 1, p. 187) plotted
curves of the relation between stuttering frequency
and plus values separately for three
groups of subjects—severe, moderate, and 
mild stutterers. In Brown’s figure, the increase 
in stuttering frequency as the number of plus 
values of words increased was different for 
the three groups, but the pattern of increase 
was similar across groups. 

Finally, there are some social situations 
which stutterers find difficult. What happens 
in different situations to the word-dependent 
stuttering pattern that has been discussed? 
As in the case of individual stutterers, the 
pattern seems to be basically consistent from 
situation to situation, only the amount of 
stuttering being affected. For example, Hahn 
(1942) found that the relative standing of 
the sounds in rank order within her own ex- 
periment was more or less constant in spite of 
the increased amount of stuttering as the 
complexities of the social situation increased. 
In a number of successive readings, Johnson 
and Knott (1937) found that in the second 
reading of the same material, 72.7%, and in 
the tenth reading, 60.7%, of the stuttered 
words were the words which were stuttered in 
the first reading. In Conway and Quarring- 
ton’s (1963) study, the position effect was 
found to be similar in the three orders of 
approximation to English even though the 
amount of stuttering was different. In Hejna’s 
study (cited in Conway & Quarrington, 1963) 
on spontaneous speech, a position effect similar 
to that in the reading situation was found. 


UNDERLYING MECHANISMS: UNCERTAINTY 
AS A LINK BETWEEN HESITATION PAUSES 
IN NORMAL SPEECH AND STUTTERING 


In normal speakers, Brenner, Feldstein, and
Jaffe (1965) found that “non-Ah” disturbances
are, in part, a function of the statistical
constraints imposed by the structure of
the language. According to Goldman-Eisler
(1958), “hesitation pauses” in normal speech
are one manifestation of the general blocking
activity which occurs when organisms are
confronted with a situation of uncertainty;
that is, when the selection of the next step requires
an act of choice. Hesitation pauses
tended to occur before long words, before
content words, and before words of high information
value. She did not examine whether
these three effects can independently affect
hesitations. As noted above, in normal prose, 
information value is confounded with word 
length and with the content-function dichot- 
omy, as is word length with the content- 
function dichotomy. 

In stuttering, the factors of position, 
length, and possibly grammatical class, plus 
initial sound of words have been shown to 
affect the frequency of stuttering. These loci 
of greater possibility of stuttering also appear 
to be the loci of great uncertainty. A stutterer, 
just as a normal speaker, might well have 
trouble in speaking or reading when there are 
many possibilities for the next word or when 
constructing and formulating ideas for a sen- 
tence are involved. But where the normal 
speaker merely hesitates, the stutterer’s speech 
is disrupted. What are some of the facts which 
suggest such a relation between loci of stutter- 
ing and of hesitation pauses? 


Uncertainty and Initial Sound 


For stutterers, there may be a range of dif- 
ficulty within consonants, but the ordering 
varies greatly among individuals. However, 
the consensus is that words starting with con- 
sonants are more difficult in general than 
words starting with vowels. The present dis- 
cussion therefore focuses only on the con- 
sonant-vowel difference. 

Consonants are distributed with more 
statistical uncertainty than are vowels, as is 
shown by deletion experiments (e.g., Miller & 
Friedman, 1957). While Goldman-Eisler did 
not study the vowel-consonant dichotomy, the 
stuttering pattern here agrees with the general
idea of uncertainty as an important variable 
in stuttering as in hesitations. 

It is a common finding that sounds in the
initial position of words are stuttered more 
than are sounds at other positions. In Johnson 
and Brown’s (1935) data 92%, in Hahn’s 
(1942) 93.13%, and in Taylor’s (1966) 
7% of stuttering events occurred at the 
initial sounds of words. According to Miller 
(1963), consonants at the beginning of words 
are distributed with greater uncertainty than 
are those at the end of words. Only five different
sounds make up more than 50% of
English final consonants, while eight are
needed to comprise 50% of the initial consonants.
Thus, listeners are more successful
guessing final than initial consonants (Bagley,
1900, cited in Miller, 1963). Sumby and Pollack
(1954) showed that the uncertainty of
prediction is about 1.5 bits greater for the 
initial letter than for the mean of all the 
remaining letters in monosyllabic words. 
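
The 50% coverage counts and the bit values cited above can be related through Shannon's entropy measure. The following Python sketch is illustrative only: the two probability lists are hypothetical, chosen so that five sounds cover half of one distribution and eight cover half of the other. They are not data from Miller (1963) or from Sumby and Pollack (1954), and the function names are introduced here for illustration.

```python
import math

def entropy_bits(probs):
    """Shannon entropy, in bits, of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def sounds_needed(probs, coverage=0.5):
    """Smallest number of most-frequent items whose summed probability reaches `coverage`."""
    total = 0.0
    for count, p in enumerate(sorted(probs, reverse=True), start=1):
        total += p
        if total >= coverage - 1e-12:  # small tolerance for floating-point sums
            return count
    return len(probs)

# Hypothetical distributions over 20 consonants (assumed, for illustration only).
final_like = [0.10] * 5 + [0.50 / 15] * 15        # five sounds cover 50%
initial_like = [0.0625] * 8 + [0.50 / 12] * 12    # eight sounds cover 50%

for name, dist in (("final-like", final_like), ("initial-like", initial_like)):
    print(name, sounds_needed(dist), round(entropy_bits(dist), 2))
```

Under these assumed distributions, the one that needs more sounds to reach 50% also has the higher entropy, which is the sense in which initial consonants are said to be distributed with greater uncertainty.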


Uncertainty and Position 


Although the different studies reviewed 
herein agree on the existence of the position 
effect, the question of its underlying mecha- 
nism is far from being resolved. 

Brown (1938b) suggested that the first few 
words in a sentence are always conspicuous, 
not necessarily because of their meaning but 
because they introduce a new idea. Thus, 


. . . the greater prominence of . . . the first words 
of sentences, with the consequent great desire of the 
stutterer to speak fluently at these points, should 
form the basis of explanation of the present findings 
[p. 119]. 


Later, Brown (1945) proposed that the posi- 
tion effect is due to a gradient of meaning, 
the earlier words of sentences carrying a 
greater communicative burden. 

What is involved in introducing a new idea? 
At the beginning of a sentence, besides the 
process of choosing individual words, there is 
the process of constructing and formulating 
the sentence. Hence, the position effect occurs 
in contextual material. These two processes of 
selecting words and formulating the idea are 
conceivably more active at the beginning 
of sentences than of phrases. Accordingly, 
Taylor’s (1966) data were examined for such 
an effect, but no tendency was found for the 
initial words of sentences to be stuttered more 
than the initial words of phrases within sen- 
tences. 

Conway and Quarrington (1963) found the 
position effect even in noncontextual ma- 
terial. It is possible that the habit of reading 
ordinary textual sentences is generalized to 
reading material of different orders of ap- 
proximation to English. In other words, a 
stutterer confronted with sentence-like ma- 
terial to read presumably goes automatically 
through the decision processes of formulating 
a sentence and choosing individual words 
even when the words in the sentence are not
arranged in the usual way. The closer the material
is to text, the stronger is this generalization
of reading habit, with closer resemblance
to text in degrees and forms of stuttering fre- 
quency as a function of position. Conway and 
Quarrington’s figure (1963, Figure 1, p. 301) 
supports this contention. 

Aborn, Rubenstein, and Sterling (1959) 
demonstrated that predicting deleted words in 
a sentence is most difficult at initial and final 
positions. Subjects can utilize bilateral constraint
to predict the medial position, making
it the easiest position to predict. However, if 
in reading test material stutterer-subjects uti- 
lize only the preceding, and not much of the 
following context, then the medial position 
may not necessarily be easier to predict than 
the final. Quarrington (1965) found a product-moment
correlation of .22 between word
position and word predictability in sentences 
of assorted lengths in prose: word predict- 
ability increased with successive word posi- 
tions in a sentence. Further, he found that al- 
though both word predictability and word 
position independently account for significant 
portions of variance in stuttering incidence, 
the contribution of the latter appears to be 
more potent. Partial correlation between posi- 
tion and stuttering, with word predictability 
held constant, was -.45 (p < .001), and
between word predictability and stuttering,
with position held constant, -.25 (p < .05).
Thus, Quarrington concluded that word position
has a relationship to stuttering independent
of word information.
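
The partial correlations reported by Quarrington follow the standard first-order formula, which removes the linear influence of a third variable from the correlation between two others. A minimal Python sketch is given below. The zero-order correlations used in the example are hypothetical (apart from the .22 position-predictability value reported above), since the text does not give the other two zero-order values.

```python
import math

def partial_corr(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return (r_xy - r_xz * r_yz) / denom

# x = word position, y = stuttering, z = word predictability.
# Assumed values for illustration; not taken from Quarrington (1965).
r_position_stutter = -0.50
r_position_predict = 0.22    # the one zero-order value reported in the text
r_stutter_predict = -0.33

print(round(partial_corr(r_position_stutter, r_position_predict, r_stutter_predict), 3))
```

When the third variable is only weakly related to the other two, as with the .22 correlation here, the partial correlation stays close to the zero-order value, which is consistent with position retaining most of its relationship to stuttering once predictability is held constant.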

The finding that the position effect has in 
itself a stronger effect than the actual pre- 
dictability of the words in the sentence gives, 
paradoxically, considerable support to the 
thesis of this paper. In all their experience 
with language, speakers learn that earlier 
words of sentences tend to have high un- 
certainty. Stutterers will therefore approach 
the beginnings of sentences with a predisposi- 
tion to stuttering based on their expectation 
that the early words will be hard to predict. 
Generally speaking, this expectation will be 
upheld, but sometimes early words are easy, 
while later in the sentence there are words 
harder to predict. These are approached with 
less predisposition to stutter—less anxiety, if 
you will—and the fact that they are in fact
harder to predict merely increases a now
small stuttering probability. An a priori analysis
cannot indicate whether the uncertainty-based
predisposition to stutter early words
has a stronger effect than the effect of the
actual uncertainty of the words encountered.
Quarrington’s results demonstrate that it does,
at least for his subjects. 

In Goldman-Eisler’s (1958) study on pauses 
by fluent speakers, subjects used only un- 
ilateral contextual constraints (either for- 
ward or reverse) for predicting the words 
that preceded and followed pauses. She found 
that information suddenly increased by a 
ratio of 10:1 after pauses. Pauses tend to 
follow words of highest redundancy and pre- 
cede words of highest information. As to the 
question of whether or not Goldman-Eisler's 
pauses are equivalent to the grammatical 
pauses considered in the present paper, she 
stated that many of her pauses fell well within 
the boundaries of syntactic units, though some 
also occurred at clause junctures. Filled 
pauses, one of the four types of hesitation 
pauses studied by Maclay and Osgood (1959), 
were found to occur at the junctures of larger 
syntactical units as well as just before points 
of highest uncertainty. The authors suggested 
that at these junctures constructional decisions 
are being made as well as decisions as to
what to say. Cowan’s (1936) analysis of all 
pauses in reading showed that of 13 pauses 
longer than 1 second, 9 came between sen- 
tences. In every case, the pause between two 
sentences was longer than the average of all 
pauses in continuing speech. 

Grammatical Class and Uncertainty 


There are studies which indicate that content
words carry more information than
function words. Aborn et al. (1959) showed
that mean percentage correctly predicted for
deletions in sentences is far greater for function
words and pronouns than for content
words. In the Goldman-Eisler (1958) study,
a greater frequency of pauses and of harder
predictions were associated with content than
function words. Unlike the Taylor (1966)
study, the grammatical-class effect in these
studies was not examined relative to the
consonant-vowel effect. It was pointed out
above that the consonant-vowel effect accounts
for the entire content-function effect
in stuttering difficulty. However, Taylor’s
material being an easy topical prose, the
topical association (Garner, 1962,
p. 266) could have acted to reduce the uncertainty
of content words.


Length and Uncertainty


There is a negative correlation between 
length of words and frequency of usage, 
according to Zipf (1935). Thus, longer words,
through mediation of frequency of usage,
might be harder to predict when deleted in
sentences, although no experiment directly
investigating this relationship has been found.
In the Goldman-Eisler (1958) study, the
words which followed the pauses (these words
had the highest information, as found in the
same study) had a mean length of 7.1 letters
per word. The mean length of words preceding
pauses (these words had the highest redundancy)
was 2.7 letters. The relation between
length and transition uncertainty overall was
very marked (Goldman-Eisler, 1958, Figure 4, 
p. 104). 


Articulatory Complexity 


Consonants are significantly more often
stuttered than are vowels, and the consonant-vowel
difference is far greater than any other
factor, suggesting that it might involve an
additional mechanism.

Hahn (1942) attempted to examine consonant
rankings in stuttering as a function of
different manners, positions, and types of
articulation, without finding any consistent
trend. The same lack of consistency is found
in Taylor’s (1966) data. It may be that
articulatory complexities within consonants
or vowels are too idiosyncratic to cause consistent
differential stuttering difficulties,
whereas between consonants and vowels they
are large enough to bring about such a difference.

According to Miller (1963), “Consonants
and vowels tend to alternate because the
vowel gives the articulators time to get ready
for the next consonant [p. 26].” We cannot
reverse this relationship between consonants
and vowels by saying consonants give articulators
time to get ready for the next vowels,
for we can easily pronounce
as we like in a row while this is not


certain limits are set in any 
example, in a syllabary of Japanese, con- 
sonants exist only in combination with vowels 
(e.g., ka, bi, etc.), while vowels have an independent


Words starting with consonants rather than
with vowels, words at earlier rather than later
positions in sentences, and longer rather




structural constraints 
manner similar to stuttering.

Studies reviewed in this article are grouped under the areas of response
problems, dynamic determinants, individual differences, clinical studies, re- 
duction of movement, and theories of autokinesis (AK). Much of the work 
to date is concerned with the demonstration of various “suggestion effects” 
without regard to the basis of residual AK. Determinants of AK are many and 
varied but little can be said about their relative potencies. Although a modified 
version of the Gregory-Zangwill model may serve well, there is presently no 
single theory of AK which accounts for all the data. Further developments 
in the theory and control of AK hinge upon the sedulous development of 
improved techniques for measuring AK. Three criteria for measuring AK are
offered.


In 1799 the astronomer Von Humboldt 
made the first report of autokinetic move- 
ment, the illusory movement of a stationary 
point of light in the dark. The illusion was 
first recognized by Sweizer in 1858 and given 
its name by Aubert in 1887. Despite scientific
study of autokinesis since 1886 when
Charpentier made laboratory observations,
there is little basic knowledge of the causes
of autokinesis. For example, until the past
decade, the primary method of measuring 
the extent of autokinetic movement was verbal 
report, a procedure which has not led to
precise control or understanding of the phenomenon.


RESPONSE PROBLEMS 


An ideal record of autokinetic movement 
would provide valid measurement of the pattern,
length, latency, direction, and instantaneous
rate of movement. A method for
achieving such an ideal record would give 
stable results, yet be sensitive and flexible. 
Since autokinesis is a subjective phenomenon
occurring in the dark, the above specifications
are not easily met. 


Measurement of Eye Movements 


One of the earliest attempts to quantify 
autokinetic movement was that of Guilford 
and Dallenbach (1928). They studied the re- 
lationship between eye movements and auto- 
kinesis by photographing the corneal reflection 
with a panoramic camera. This apparatus is 
reported to record a 1° deviation of fixation 
with a .7-mm deflection of the light from the
cornea, but they discovered no relationship 
between autokinesis and eye movement. The 
Guilford-Dallenbach experiment was criticised 
by Skolnick (1940) because only horizontal 
eye movements had been recorded while much 
of the movement is vertical. Skolnick obtained 
some evidence that eye movements are necessary
for the initiation of autokinetic movement.
He contended that the method of direct
observation is more sensitive than the photo- 
graphic technique. 

Further argument against eye movement as 
a measure of autokinetic motion was offered 
by Gregory (1959). With a small red circular 
filter placed in line with the center of a 2-inch 
blue filter, Gregory was able to obtain autokinetic
motion with the red stimulus. If the
subject lost fixation, the red target was “surrounded
by a bluish-white halo [p. 114].”
None of the 50 subjects tested observed the 
bluish-white halo during movement, while all 
observed it when asked to fixate a displaced 
point. Gregory did not report the distance 
between the subject and the stimulus light, 
nor the visual angle subtended by the blue 
filter. Therefore, the relevance of his experiment
to the eye-movement issue is not established.
Lehman (1965), using an ophthalmograph
for the recording of eye movements,
showed a relationship between relatively 
“large” eye movements and the initiation of 
autokinesis, but large eye movements were 
evident even when no movement was reported.
An improved measure of eye movement is
needed in work on autokinesis. Such a measure
would record movement of at least 1°,
would not be invalidated by head movements, 
and would record in two dimensions. In view 
of the considerable question over the role of 
eye movements in autokinesis, it is doubtful 
that an eye-movement measure of autokinesis 
would ever suffice (see Theory section). 


Judgment Procedures 

Studies using estimation or judgment pro- 
cedures differ only in the type of comparative 
judgments expected of the subject. Carr 
(1910) reported movements in terms of degrees
of visual angle. Some investigators have
asked the subject to provide a verbal report
of the autokinetic experience (Hoffman,
Swander, Barron, & Rohrer, 1953; Luchins,
1954a, 1954b, 1954c), while others have provided
the subject with a measure (Graybiel &
Clark, 1945) or coordinate system (Battersby,
Kahn, Pollack, & Bender, 1956) presented
after illumination of the room.

Fig. 1. Diagram of half-viewing apparatus showing
changing positions of moveable wall with respect to
the subject (S). (B is aligned with and adjacent to
the back of autokinetic, AK, box which is 39 inches
long. The subject’s left eye views the autokinetic box
and his right eye views a checkerboard on the moveable
wall and side of the autokinetic box.)

Judgment or estimation procedures may involve
serious errors of judgment and/or of
memory for the extent of the illusion. Miller
and Graybiel (1962) have attempted to systematize
the use of verbal reports of movement
by having the subject give a running
commentary of all movements and tape recording
them. Prior to the experimental session,
the subject was required to commit to
memory an elaborate grid in terms of which
he reported movement in a dark room. Following
the experimental session, the subject
assisted the experimenter in interpretation of
the playback and the plotting of movement on
the grid. Miller and Graybiel reported high
reliabilities (r = .97 for normals) between the
measurements in different conditions of the
experiment. Luchins and Luchins (1963) devised
a “half-viewing” technique which allows
measurement of autokinetic movement by
immediate reference to a fully lighted checkerboard
system (see Figure 1). The technique
involves a marked change from the usual autokinetic
situation. In comparison to normal
autokinetic viewing, the half-viewing tech- 
nique probably increases the amount of eye 
strain and eye movement. Thus the technique 
may give rise to a change in the amount of 
autokinesis experienced (see Theory section). 

Cormack (1963) has recently studied what 
Carr (1910) referred to as Type II auto- 
kinetic viewing. In Type I autokinesis, fix 
tion remains with a light which seems t0 
move. In Type II viewing, the point of fixa- 
tion seems to remain stationary but the light 
appears to move. Cormack had subjects point 
to the starting point with a flashlight pointet 
when the room lights were turned on. 
method does not provide a measure of total 
distance travelled since there can be no indi- 
cation of deviation and returns of the light. 
Cormack also had the subjects trace the ust 
movement on paper. The reliability data reported
for latency in both the Type I and
Type II viewing situations was high (r = .97).
Reliability for extent in Type II viewing before
and after a 1-week period was also high
(r = .16). Comparisons between the extent of
Type I and Type II movement, however, were 
lower (r =.34). 

Reported reliabilities for judgment procedures
are usually high, but their validity is
questionable. The method does not ensure
that the subject will attend to movement
throughout autokinesis and, with the exception
of Cormack’s method of measuring Type
II autokinesis, makes heavy demands upon
memory and judgment. Subjects may simply
learn to repeat a previous judgment under 
such conditions. 


Tracing and Tracking 


Guilford and Dallenbach (1928) had subjects
draw the path of movement. Voth (1941)
had subjects draw movement and indicate 
complete stops during the movement period 
by a heavy dot. An automated version of the 
pencil-tracing method has been suggested by 
Newbrough and Beck (1962). The subject’s 
wrist is strapped to the recording apparatus
and he traces observed movement by indent- 
ing an 8-inch soft foil recording field divided 
into four quadrants. Each quadrant is con- 
nected to a chronoscope which records the 
time spent in that quadrant. The small train- 
ing field used would seem to attenuate the 
validity of length of movement and deviation 
measures obtained from the trace in the soft
foil, and the time in each quadrant measure is
rather gross. Bridges and Bitterman (1954)
pointed out that the tracing method introduced
errors which reflected the inability of
untrained subjects to reproduce a line of a
given length when there is no visual feedback
from the trace.

The essential distinction to be made be- 
tween tracing and tracking procedures is that 
although tracing procedures permit tracing 
during movement, they do not allow the subject
to “follow” the movement of the autokinetic
light. The subject is reproducing the
perceived movement in a plane which is not
parallel to the autokinetic motion. In tracking
procedures, the subject can keep the
tracking instrument “lined up” with the moving
light. The first attempt to track autokinetic
motion was that of Bridges and Bitterman
(1954). They constructed an off-center
pivoted lever apparatus which the subject
could move and which contained the stimulus
light at the other end. The subject was required
to compensate for observed movement
of the autokinetic light by moving the stimu- 
lus unit in a direction opposite to the move- 
ment. Direction and extent were recorded on 
a constant-speed kymograph. The record ob- 
tained was not a measure of autokinetic mo- 
tion, but was instead a measure of the judged 
distance from some point in the movement to 
what the subject believed to be the original 
starting point. Conklin (1955) pointed out 
that the procedure introduces errors due to 
the interaction of real and illusory movement. 

The problem of providing an apparatus for 
tracking illusory movement was taken up by 
Conklin (1956, 1957). He used an oscillo- 
scope for the stimulus unit to furnish either 
real movement of a stimulus light or a sta- 
tionary stimulus light. The tracking unit con- 
sisted of a gun-fire control sighting station 
with a head rest. First a subject tracked vary- 
ing rates of real movement under conditions 
of dim illumination. Practice in tracking real
movement on the oscilloscope was continued 
under conditions of immediate knowledge of 


results until the subject capably tracked the 
Conklin was able 


prior observation of real movement seems to 
act as a powerful suggestion which affects 
later trials with autokinetic movement (see 
Suggestion section). 

Royce, Stayton, and Kinkade (1962) have
provided a tracking situation in which the
tracking unit felt like it was “on” the stimu- 
lus light. The stimulus was presented via a 
system of mirrors so that the subject was 
able to view the light directly in front of him 
and in front of the tracking surface. An ad- 
justable headrest was provided to minimize 
head movements. When movement was per- 
ceived, the subject was to maintain the track- 
ing pencil directly on the 
Measures of direction, extent, and rate were 
obtained directly from the trace, and latency 
was measured by means of a timer deactivated 
by the subject when movement was first 


tracing 

associated with it. Chief among these is the 
possibility that both the illusion and indicator 
responses may vanish as the operations per- 
formed in measuring autokinesis become more 
similar to those involved in veridical percepts. 
That is, tracking methods are, in general, sub- 
ject to the serious criticism that the subject 
may become aware of the discrepancy between 
where he is looking and the highly reliable 


judgmental factors by using the covariance of 


treme stimulus conditions and with extremes 
of autokinetic movement? If tracing or tracking
is used, a sufficiently large drawing surface
is essential for extreme cases. Yet the
presence of the drawing surface should not
tend to inhibit the autokinetic effect and
should allow the use of sizeable autokinetic
displays placed directly in front of the subject.
It seems wise, despite a probable
loss of tracing accuracy, to place the drawing
surface at the subject’s side. This would eliminate
the possibility of tracking since the autokinetic
dot could never be “lined up” with the
tracing pencil, but lateral placement of the
tracing surface would presumably mean that
there could be no tendency for a subject to
suddenly become aware of a discrepancy arising
between visual fixation and the kinesthetic
sense of the position to which he has traced.
If the subject’s tracing arm were cuffed by an
opaque cloth, there would be room to intermittently
photograph a transparent tracing
surface grid and a chronoscope by means of a
wide-angle motion camera. Sharp focus of

chronoscope could be achieved by coordi 


Precise control of the rate of film exposure
would make the strobotac and chronoscope
unnecessary in many studies. The film record
should provide information for any two-dimensional
measure of autokinesis. Information
could be conveniently retrieved by tracing
over the projected film record with an
X-Y transducer having output to a comp 
punch-card system. 


DYNAMIC DETERMINANTS


Study of the role of higher-order variables
began with Sherif (1935). It shortly led to
attempts to control the illusion through learning
and conditioning procedures. This type of
investigation in turn produced studies of motivational
determinants.


Suggestion 


Sherif’s experiments concerned what he
called “social suggestion.” A naive subject
observed the illusion in isolation but made
verbal report of movement in the company of
a confederate of the experimenter. Subjects
tended to adjust reports in the direction of
the confederate. Furthermore, adjustment persisted
over several sessions. In Bovard’s study
(1948), adjustment persisted for at least a
year. In another type of experiment done by
Sherif, the subject was informed that the light
would move in a certain direction. Subjects
again reported movement in the suggested direction.
Haggard and Rose (1944) were later
able to modify movement by instructing the
subjects that the light would move “most” or
only “some” of the time.

Throughout his research, Sherif attributes 
autokinesis to events occurring within the
perceptual field because of an objective frame
of reference in the subject’s perceptual experience
which is reflected by decreasing variability
of reports of extent. The percept frame
is said to be structured by social suggestion.
Harriet Linton (1954) was the first to argue 
that Sherif manipulated the response scale 
rather than autokinetic percepts per se. 

Prior observation of an actually moving
light is extremely powerful as a suggestion
which influences extent of an immediately
following autokinesis (Blumenthal, 1961;
Hoffman et al., 1953). Yet Rohrer, Baron,
Hoffman, and Swander (1954) proposed that
social suggestion is more powerful than movement
suggestion, at least for long-term effects.
They had groups of subjects observe real
movement of either short or long extent. Then
all subjects were placed in the Sherif social-suggestion
situation. Reports of movement
were still at the social norm and showed no
influence from the movement norm. Unhappily,
no control was run with social suggestion
first and movement suggestion second. Therefore,
statements about the relative potencies of
movement and social suggestion are not warranted.

Several studies have been concerned with
the effect of expectancy upon movement.
Carr (1910) reported that volition usually
affects movement, but only after a period of
time and when accompanied by emphatic
Smet Comalli, Werner, and Wapner 

) used pairs of running horses, arrows, 
and running boys, with the directional dynamics
reversed within pairs. They obtained an
effect of directional dynamics. They also obtained
the effect with ambiguous figures; for
example, an object called parachute tended to

the ra, but if the experimenter called 
€ object a balloon it tended to move 

d. Comalli (1960) again obtained the 


ment. The interaction between degree of 
stimulus-dynamism (static, dynamic) and 
professional status (artist, scientist) was sig- 


tingencies were the r 

menter who informed re of the “accu- 
» of his judgments after 

pros The extent to which subjects changed 

their judgments in a later session in the pres- 

ence of a confederate subject (who judged 


“failure” up had 

the iam group significantly lower, sug: 
tibility scores than the “no reinforcement 

and the “ambiguous reinforcement” groups.
Without similar intentions, but with similar
results, Rechtschaffen and Mednick (1955), and
Mednick, Harwood, and Wertheim (1957) in- 
vented the “autokinetic technique.” Subjects 
were told that the moving light would spell 
out words and were praised when they saw 
them. Word production increased through the 
session. This research is a good example of 
what is now called “verbal conditioning” (v. 
Krasner, 1958, for a complete review). Far- 
row and Santos (1962) successfully changed 
the direction of movement by exposing sub- 
jects to shock during a session of real move- 
ment when the light was on the side which was 
initially preferred during autokinetic move- 
ment. A control group which received no shock 
was also used. 

The previously cited study of Kelman 
(1950) was the first autokinetic study with 
interpretations in terms of motivational factors.
His subjects were given the Guilford-Martin
inventory of personality factors. Scores
on ascendency, self-confidence, and lack of 


to explain. 

Jakubczak and Walters (1959) studied
autokinetic suggestibility as “dependency behavior.”
It was found that the degree of in-
bility in subject’s responses

Variability in subject’s reports was increased
by discrediting the person giving the
suggestions, thereby presumably lowering his
prestige.

These observations clearly imply a social-conformance
motive (Rohrer et al., 1954) in
subjects who participate in autokinetic experiments.
An empirical determination of the relationship
of physiological indices of drive
(both variability and magnitude) to the social-conformance
motive could be elucidating
but has not yet been carried out in connection
with autokinesis.

In the autokinetic situation there are at
least two kinds of theoretical processes: correct
or incorrect responses associated with the
presence or absence of the phenomenal illusion.
The question is whether the perception
is modified or whether subjects merely transform
their responses, regardless of
events, so as to conform to suggestion, past
experience, or motivational incentive. Such
problems in perception research were recently
analyzed by Garner, Hake, and Eriksen
(1956). If expectancy, learning, or utility
determine percepts in the above autokinetic
situations, then the perceptual consequences of
those events must be shown to hold constant
over various independent response or measurement
media. To date, this has not been systematically
carried out for the autokinetic
illusion.


INDIVIDUAL DIFFERENCES


In view of the problems which arise in
measuring autokinesis, it is not too surprising
that individual differences studied thus far
are not very reliable. Within this
limitation, sex is the most reliable variable to
date. Fisher (1962) found that, beginning at
age 11, males were significantly more right-directional
for autokinetic movement than
females. He suggested that the difference may
be caused by differences in right versus left
muscle-tonus levels. Both Newbigging (1954)
and Chaplin (1955) found a tendency for females
to make smaller estimates of extent of
movement during their first trials, followed by
a shift to larger estimates after facilitating
suggestion. McKitrick (1965) found longer
latencies for females. Elfner and Page
(1963a), using a flickering light, found a
shorter latency for females. This effect
disappeared when practiced subjects were
used (Elfner & Page, 1963b). They suggested
that the initial difference may have been due
to a greater responsiveness of the females to
the instructions at the beginning of testing.
The implication is that sex interacts with
the degree of suggestion in determining autokinesis,
but more work is needed to establish
this point.
Voth’s (1947) major goal was to correlate 
pattern differences in tracings with person- 
ality syndromes. His efforts with 600 subjects 
were largely unsuccessful, although it did ap- 
pear that introversion was associated with 
greater extent of movement. High intrasubject 
reliability (.96) was found for length indices. 
A J-curve was obtained with large samples of 
extent judgments, indicating a reality orienta- 
tion with tendency to minimize the illusion. 
Jakubczak and Walters (1959) studied 
autokinetic suggestibility as related to “de- 
pendency behavior” in 9-year-old boys. De- 
pendency was defined as willingness to accept 
help (Kreshner, 1957). A dependent group 
was found to be significantly more influenced 
than an independent group, but only in the 
first of two experiments. Young and Gaier 
(1953) found an inverse relationship between 
autokinetic suggestibility and self-sufficiency.
Sexton (1945) found that a prominence in
the Rorschach movement factor predicted
large reported autokinetic movement, but
Steisel (1952) was unsuccessful in his attempt
to relate Rorschach protocols to reported
extent measurements in a suggestion
experiment.
In view of the number of nonsignificant and
conflicting results recorded to date in the area
of individual differences, it would be unwise
to embark on further fragmentary studies of
the sort reviewed here. The authors believe
that any significant progress in this field
awaits serious multidimensional and multivariate
studies incorporating detailed operational
definitions.


CLINICAL STUDIES


Of 50 diagnosed schizophrenia patients 
by Sexton (1945), 41 reported large 


groups has been studied by 
Mettler, and Kline (1949) and Kline (1952,
1956). Heath et al. reported a study in which 

tested 


following lobotomy or topectomy operations, 
The study is also of interest because of 
increased mobility attributed to the effects of 
post-lobotomy 
patients, none reported movement when tested 


months, three of five patients recorded 
average movement of only 4.1 
the 12 post-topectomy patients 
ment of an average of 28.3 
the total movement observed 
operative patients was shorter 
plex than that recorded by the control 
Kline’s study (1952) involved both 
psychoneurotics indicated appreciably more
movement.

The clinical literature associated with the 
autokinetic phenomenon is conflicting. Two 
major sources of the reported discrepancies 
can be attributed to differences in population 
sampled and differences in the type of index 
used in obtaining a score from the raw data. 
It is clear that future studies of autokinetic 
motion in abnormal populations should report 
different measures separately and give rele- 
vant details operationally defining the popu- 
lation sampled. 


REDUCTION OF MOVEMENT 


Aside from its theoretical implications, ef- 
fective reduction of autokinetic movement 
would be of value in certain applied areas, 
such as night combat, flying, and night driv- 
ing. 


Display Factors 


Several studies have demonstrated increased 
latency and/or decreased extent of movement 
with an increase in the intensity of the light 
from background or from patterns peripheral 
to the autokinetic dot (Edwards, 1954; Kar- 
woski, Redner, & Wood, 1948; Luchins, 
1954b; Miller & Graybiel, 1963; Royce, Stay- 
ton, & Kinkade, 1962). Nonetheless, Edwards 
reported some movement with stimuli of 174 
footlamberts. Royce et al. obtained no signifi- 
cant reduction when the intensity of the auto- 
kinetic dot was increased. Uchiyama and 
Keiichiro (1963) have devised a new tech- 
nique for measuring what they call “field 
forces” of visually-perceived figures. Their 
index of field force is essentially an index of 
autokinetic movement. A test-point subtend- 
ing 34-minute visual angle was presented for 1 
second between the figure and a scale-point of 
10-minute visual angle, the luminous intensity 
of which was variable. Under these conditions, 
the test point usually appeared to move 
swiftly away from the edge of a contour on a
solid figure when it appeared and back to it 
as it disappeared. The movement could be 
counteracted by increasing the luminous in- 
tensity of the scale point. It should be noted 
that in all studies of the effect of intensity 
there may be confounding with change in the
spectral distribution of light. As the intensity
of light is varied by changing the voltage
applied to the tungsten filaments used, an
appreciable spectral shift occurs. Such a shift
could be eliminated by using neutral-density
filters to control luminance.

The effect of changing the size of the
autokinetic light pattern, invariably confounded
to date with an increase in the total illumination
or irradiance, has been the most thoroughly
studied of all factors tending to reduce
movement. The general finding when either
increasing the number of lights or the size of
a single light is that the latency of autokinesis
increases and its extent decreases (Edwards,
1954a; Graybiel & Clark, 1945; Luchins,
1954c; Royce et al., 1962). Autokinesis
may still occur with very large stimuli and it
is difficult to eliminate completely (Edwards,
1954a, 1954b; Graybiel & Clark, 1945;
Karwoski et al., 1948). Karwoski et al. attributed
earlier reports of negligible movement with
large stimuli to the closeness of the stimulus
to the subject. This interpretation is open to
direct empirical test at distances from which
cues from accommodation and convergence
are not operative. Royce et al. (1962),
although finding a reduction in movement when
a ł-inch concentric band was placed around
the autokinetic dot, found no variation in
reduction effect as the radius of the band was
varied. But, in this experiment, the dimension
of circle size was confounded with the dimension
of distance from band to autokinetic
dot. Using their previously described technique
for measuring field forces, Uchiyama
and Keiichiro (1963) noted a decrease in
autokinetic movement as the distance between
figure and test point increased. With oa
subjects, Elfner and Page (1963a, 1963b)
found that a dot flickering 10 cps above or
below fusion gave greater movement and shorter
latency than did a fused source. This was
true despite the tendency for the light to be
perceived as brighter when flickering. It is
therefore necessary to distinguish between
brightness and intensity as factors determining
movement.

A number of experimenters have studied the 
effect of having the subject inspect a fixed
figure which is removed just before exposure
of the autokinetic dot. Results have been
marginal. After failing in their first experiment,
Crutchfield and Edwards (1949; Edwards
& Crutchfield, 1951) reduced the
period of preexposure and were able to obtain
greater reduction in movement when a semicircle
was preexposed in the path of the usual
direction of movement than when preexposed
on the other side of the dot. These authors
assumed that long preexposures impeded
autokinetic movement over an extensive area 
of the retina. They did not comment on the 
likelihood that a negative afterimage, how- 
ever achieved, might furnish a stabilizing 
frame of reference. Conklin (1957) used a 
number of experimental conditions designed 
to test for induced cortical satiation. In two 
experiments, it was found that inspection fig- 
ures varying in size and shape, filled and un- 
filled, and varying in orientation in the visual 
field, did not significantly affect the direction, 
latency, rate, or displacement of autokinetic 
movement. Livson (1953) was able to reduce 
autokinetic movement by using prior inspec- 
tion of beta movement. The reduction ex- 
ceeded that obtained with control groups when 
the beta-movement lights were flashed at rates 
of alternation giving no beta movement. 

In line with their theory that changes in
organismic state are mirrored by perceptual
changes, Miller, Werner, and Wapner (1958)
predicted and obtained a significantly higher
proportion of upward motion for an ascending
than a descending tone. A study by Glick,
Wapner, and Werner (1965) confirmed this
finding and added the fact that subjects felt
as if their bodies were also moving up or
down with the tones. Apparently the dynamics
responsible for autokinesis and body movement
leave a negative aftereffect. Subjects
gave evidence that a reference line was displaced
downward after a session with ascending
tone. Soloyanis and Corso (1953) found
that the extent of movement depended upon
the magnitude of the bilateral intensity difference
of binaural stimulation. Movement
was toward the side of greater auditory
stimulation. They also found a reduction
in effect as the number of trials increased.
Using their previously discussed novel
technique of “half-viewing,” Luchins and
Luchins (1963) discovered a number of conditions
which reduce autokinetic movement.
Generally, there was little latency of move- 
ment and the movement was more continuous 
with the half-viewing condition than when 
only one eye was used. As the light viewed by 
the “outside eye” became more diffuse and
nonarticulated, there was a decrement in
amount of movement reported. If the periphery
of both eyes viewed a lighted
dot remained stable. With presentation of
multiple dots, there was a general tendency
for the dots to move as a group. Upon initial
exposure, both multiple and single dots

moved toward the illuminated view.
angle of regard decreased, with the result that
the dot became closer to the “inside


would expect, it was found that movement
decreased as an illuminated wall was brought 
closer to the subject’s outside-viewing eye. 
With two boxes, one for each eye, when the 
externalized light was cast on the end dot of 
a row to form an “and summation” pattern, it 
moved with the end dot and the remaining 
dots were stationary. On the other hand, when 
cast into a square of dots to form a “Gestalt 
frame” pattern it became stationary, sug- 
gesting greater effect for an “intensive unity” 
or Gestalt frame than for an “and-summa- 
tion” frame. This effect may be related to 
Green’s (1961) important findings on the 
function of regular configurations in promot- 
ing figure coherence in the kinetic depth ef- 
fect. On the other hand, Edwards (1959) had 
measured duration of movement for patterned 
matrices of dots and randomly arranged dots 
and found little difference. The exact relation- 
ship of “half-viewing” to autokinesis in the 
usual setting is not established. The procedure 
in “half-viewing” would seem to foster more 
eye movement or more eye strain than does 
the usual measurement setting, but data are
required to settle this matter.
In agreement with earlier reports by Carr
(1910), Luchins (1954a) has reported that 
peripheral viewing gives much more rapid 
rates of movement than central viewing. 
Movement while viewing from the periphery 
of the eye was also smoother and more predictable
in direction.


Subject Factors


The effect of sensory deficits has been explored
in two studies. Battersby, Kohn, Pollack,
and Bender (1956) found loss of the
autokinetic response to rotation with vestibu- 
lar dysfunction, but little effect from somato- 
sensory deficit. With visual deficit, autokinesis 
occurred mainly toward the normal perimeters 
of the eye but could be reversed by turning 
the body in the appropriate direction. Using 
a group of labyrinthine defectives, Miller and 
Graybiel (1962) have confirmed an earlier 
finding (Graybiel & Niven, 1956) that the 
sensory organs of the inner ear are not essen- 
tial for the perception of autokinetic move- 
ment. They did find a significantly greater 
amount of angular movement for the defec- 
tive group when compared with normals and 
suggested that the defect contributed to an 
inability of the defectives to stabilize the tar- 
get in space. Whether subjects were recum- 
bent or sitting did not significantly affect the 
extent of the movement. 

Honigfeld (1961) embedded extent and 
latency measures of autokinesis in a battery 
of 48 perceptual and intellectual measures. 
Except for a -.39 correlation of extent with
latency, no measures in the battery correlated
appreciably with the autokinetic measures.
McKitrick (1965) found little relationship
between autokinesis and body activity, reversible
figures, and visual figural aftereffects.
Anderson (1965), however, found a tendency
for cross-modal positive correlation between 


various measures of auditory and visual auto- 
kinesis. 


Miscellaneous Factors 


Farrow, Santos, Haines, and Solley (1965)
found that latency of autokinesis decreased 
with practice under both massed and spaced 
conditions of practice. Extent measures in- 
creased under massed practice only. 

Singh and Singh (1961) administered stimu- 
lant and depressant drugs and used a placebo- 
control treatment. They found a significantly 
decreased latency for the excitant drug. These 
effects were interpreted by those authors as 
due to the effect of the drugs upon the rate of 
cortical satiation. This interpretation is subject
to testing through measurement of figural
aftereffects as a function of the drug. We
have found no studies of autokinetic movement
as a function of any physiological measures
of arousal.


THEORIES OF AUTOKINETIC MOVEMENT 


A wide variety of theories have been ad- 
vanced to explain the illusory phenomenon. 
They are based variously on eye movements, 
strain in the muscles surrounding the eye, the 
sensory-tonic field, the streaming of retinal 
fluids, figural aftereffects, and other causes 
which are not so readily classifiable. 


Eye-Movement Theories 


Hoppe (1894), first proponent of an eye- 
movement theory, suggested that the illusion 
is due to involuntary eye movements of which 
the subject is unaware. These movements 
cause the subject to mistakenly attribute 
movement to the object. The theory has never 
been widely accepted. Guilford and Dallen- 
bach (1928) reviewed the evidence contra- 
dicting it. Skolnick (1940) believed much of 
the evidence marshalled by Guilford and Dal- 
lenbach was invalid. His subjects were unable 
to distinguish ordinary autokinetic movement 
from that induced by rotational and caloric 
nystagmus. Further, by direct observation of
the subject’s eyes, Skolnick was able to reliably
predict the direction of autokinetic
movement. Lehman (1965) found that eye
movements were related to the onset of autokinetic
movement and to changes in its direction.
The results of Matin and MacKinnon
(1964) indicate that horizontal stabilization
of the retinal image greatly reduces the frequency
of horizontal autokinetic movements.
Matin and MacKinnon related this effect to
the work of Hubel and Wiesel (1959, 1962)
on individual neurons of the cat’s cortex which
are differentially sensitive to different directions
of movement. These recent studies indicate
that eye movements may indeed play
some role in autokinetic movement, but more
precise measurements relating the direction of
eye movements to the direction of autokinesis
are needed.


Muscle-Strain Theories 


If one stands with his side close to a wall,
presses his arm against the wall for a time,
and then steps away from the wall and relaxes,
the arm will rise involuntarily due to
the myotatic stretch reflex. This phenomenon
of movement in the direction of muscle strain
has been used by several writers in attempts
to understand the autokinetic phenomenon.
Charpentier (1886) attributed the illusory
movement to strains arising in extrinsic eye
muscles during fixation. Strain sensations
have been associated in the past with environmental
movement; therefore such sensations
can be interpreted as due to a moving stimulus
in the autokinetic situation, Charpentier
reasoned. Since the head is sometimes turned
in following movement, the same argument
should apply to neck muscles. Bjorkman and
Gothlin (1930) extended the theory of muscle
strain to account for foveal as well as peripheral
autokinetic movement.

Adams (1912) studied the effect of position 
of the eye in its socket upon the direction of 
autokinetic movement. He found that the 
tendency for the point to move in a given direction
increased as muscular tension increased
in the same direction. Davis (1952)
in effect replicated Adams’ research. His results
supported the contention that muscular
fatigue facilitates autokinetic movement.
Crovitz (1962) used monocular viewing to
stimulate the eye muscles with movement,
thus causing strain in the direction of the
viewing eye. The direction of initial autokinetic
movement tended toward the eye stimulated
for every subject. Similarly, Gregory
and Zangwill (1963), though attributing the
origin of autokinesis to minor fluctuations of
a central neural system monitoring efferent
impulses in control of eye movements, have
reported pronounced and immediate autokinesis,
usually in the opposite direction however,
after induced eye-muscle strain. They
also found similar, though less marked, results
following induced tension of neck muscles.
Results showing movement in the direction
opposite to induced muscle strain may seem
to contradict Charpentier’s analysis. They do
not if it is assumed that “sensed” muscle
strain is relative to the adaptation level of the
muscle after a period of straining. Thus,
the strained arm not only rises but does so
involuntarily and rather imperceptibly unless
one attends visually to the arm. If the strained
arm is deliberately kept from rising (or, 
analogously, if one keeps the autokinetic dot 
fixated) there is an illusion of strain in the 
opposite direction. According to Charpentier, 
the result should be movement in the direc- 
tion opposite to that of the original induced 
eye-muscle or neck-muscle strain. 


Sensory-Tonic Field Theory 


Werner and Wapner (1949, 1952) hold that 
sensory and tonic factors are dynamically 
equivalent and interact; muscular tonus at 
some point in the body can influence sensory 
processes. They postulate that a certain 
amount of sensory-tonic energy is available; 
this “energy may either be released through 
body movement or may express itself in per- 
ceptual displacement [1949].” Just what the 
“sensory-tonic energy” is and where it comes 
from were never clearly discussed. 

Goldman (1953) tested the sensory-tonic 
hypothesis that inhibition of motor expression 
would tend to increase the duration and com- 
plexity of autokinesis and decrease its latency. 
Three levels of motor activity were used, 
ranging from complete immobilization to con- 
tinuous movement of the arms. The results 
supported the sensory-tonic hypothesis. Nag- 
atsuka (1960) investigated the effects of sev- 
eral bodily conditions on autokinetic move- 
ment. He found that the amount of movement 
was greater when standing than when lying 
down, and that movement occurred in the 
direction of head rotation, but in a direction 
opposite to the side on which a weight was 
lifted. He interpreted these results as consistent
with the sensory-tonic theory. How-
ever, according to the sensory-tonic field 
theorists, there should be less motor involve- 
ment and thus more “perceptual activity” 
when lying than when standing. Therefore, it 
seems difficult to reconcile Nagatsuka’s find- 
ing with what would be expected from the 
theory. 

Wishner and Shipley (1954) derived the
assumption that natural handedness is a reflection
of tonic tensions within the person
from the sensory-tonic field theory, and tested
the hypothesis that subjects should see more
movements toward their dominant side. Their
hypothesis was confirmed for right-handed

erence, momentary streaming effects should
appear to involve a large area when subjec- 
tively projected into space and thereby induce 
an illusory effect on small autokinetic dots. 


Such an assumption is not at odds with
adaptation-level theory, for streaming effects are
quite possibly relatively unpredictable within
the same individual and may therefore not
be assimilated into the adaptation level.


Satiation-Based Theories


For two reasons, several investigators have
found peripheral theories of autokinetic movement
unsatisfactory and have called for a
cortical interpretation. First, the illusory
movement is to some degree subject to voluntary
and social control. Secondly, marginal
evidence suggests that movement during autokinesis
may be influenced by cortical events
which are extraneous to peripheral feedback.
Crutchfield and Edwards (1949) and Royce
et al. (1962) found that extent of autokinetic
movement is influenced by the prior inspection
of some object in the visual field. That
the reduction in extent of movement can be
obtained even when the inspection and subsequent
autokinetic observations are carried out
with different eyes suggests, if one discounts
the possibility of retinal aftereffects, that
nervous structures higher than the retina are
in some way involved in the phenomenon.

The most common cortical interpretation
uses Köhler and Wallach’s (1944) theory of
satiation. Crutchfield and Edwards (1949)
and Edwards and Crutchfield (1951) made


distorted by the accumulated resistance from
the inspection. Further incoming impulses are


direction. Lehman (1963), following Osgood
and Heyer (1952), assumed that satiation is
normally distributed on the cortex as a

he did not include a measure of extent among
his dependent variables, using only latency,
rate, and displacement.

Overall, then, the prediction that satiation
should reduce the extent of autokinetic movement
has support, but the directional prediction
does not. This indicates that the satiation
theory of autokinesis is not adequate.


Other Theories 


Several theories of autokinetic movement
do not fit readily into the previous groups.
The most common is the frame-of-reference
type of theory, which states that the light
appears to move because there is no reference
to which it can be related (Sherif, 1935)
or because the field is highly structured
(Edwards, 1959). Sherif’s “theory” is quite
unsatisfactory since it fails to give a positive
account of the cause of illusory motion and
does not account for the latency period.
Edwards’ theory was not empirically confirmed
when the autokinetic effect of structured
fields (low in information) was compared
(Edwards, 1959) with that of a randomly
patterned field (high information).
Frame-of-reference theories focusing
on distal stimulus configurations appear
inadequate to account for the illusion. Their
adequacy is increased, however, if the streaming
phenomenon is assumed to furnish a pseudo-


Fig. 2. A diagram of the hypothetical “outflow” eye-
movement control system suggested by Helmholtz.
(Figure labels: rectus muscle; command signal;
monitoring; optic nerve; image/retina movement
signal; movement of whole field when out of balance.)


1930). As a final point, the movement of 
afterimages during voluntary eye movement 
is explained since the image does not shift 
across the retina to give an image-retina 
signal to cancel the image of CNS efferent 
discharge. 

The model of Gregory and Zangwill, shown 
in Figure 2, incorporates the automaticity 
espoused by von Holst as an explanation of 
autokinesis. According to the model, the effect 
on perception of movement across the retina 
due to movement of the head or eyes is 
automatically cancelled in the comparator on 
the basis of information provided by the 
command monitoring loop. The origin of auto- 
kinesis is held to be inefficiencies of this 
oculomotor system, mainly due to adaptation 
of the monitoring component. Adaptation of 
the monitoring loop tends to distort the con- 
tinuous flow of signals from the command 
center to the comparator and give rise to 
autokinetic motion. Thus autokinesis may 
occur immediately after experimental opera- 
tions are used to induce adaptation in the 
monitoring loop associated with eye-muscle 
strain. Autokinesis will occur otherwise only 
after a period of latency so that adaptation 
in the monitoring loop may build up as a 
consequence of having to view a small target 
in the dark. 
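
The cancellation logic of this model can be illustrated with a small
numerical sketch. It is offered only as a toy reading of the verbal
account above, not as Gregory and Zangwill's own formulation: the
function name, the velocity units, and the treatment of adaptation in the
monitoring loop as a simple additive bias (monitor_bias) are all
assumptions introduced here for illustration.

```python
def perceived_motion(eye_velocity, target_velocity=0.0,
                     monitor_bias=0.0, retina_locked=False):
    """Toy comparator for an "outflow" eye-movement control system.

    All quantities are velocities in arbitrary units (hypothetical).
    monitor_bias is an assumed stand-in for adaptation of the
    monitoring loop; it is not a parameter given in the source.
    """
    if retina_locked:
        # An afterimage does not shift across the retina when the eye moves.
        retinal_signal = 0.0
    else:
        # Image/retina signal: motion of the image relative to the moving eye.
        retinal_signal = target_velocity - eye_velocity
    # Monitored copy of the command signal, possibly distorted by adaptation.
    monitored_command = eye_velocity + monitor_bias
    # The comparator sums the two signals to yield perceived motion.
    return retinal_signal + monitored_command


# Accurate monitoring: an eye movement over a stationary target cancels out.
print(perceived_motion(eye_velocity=3.0))                      # 0.0
# Distorted monitoring during steady fixation: illusory motion appears.
print(perceived_motion(eye_velocity=0.0, monitor_bias=0.5))    # 0.5
# Afterimage during a voluntary eye movement: it appears to move with the eye.
print(perceived_motion(eye_velocity=2.0, retina_locked=True))  # 2.0
```

Under these assumptions the sketch reproduces three points made in the
text: cancellation with accurate monitoring, apparent motion of a
stationary target when the monitoring signal is distorted, and movement
of afterimages during voluntary eye movement.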

The Gregory-Zangwill model seems reason- 
ably consistent with the tenuous relationship 
of eye movements and autokinesis, and it 
fits the consistent relationship between eye- 
muscle strain and autokinesis whether or not 
adaptation-level effects are posited for kines- 
thetic feedback from the eye muscle to the 

comparator (not shown in Figure 2). Matin 
and MacKinnon’s (1964) movement-receptor 
data are encompassed by the model if it is
assumed that small eye movements which affect
retinal input to the comparator may not
be monitored. On the basis of Gregory’s
(1958) previously discussed and insufficiently
reported blue-filter technique and evidence
that prior eye strain causes illusory movement
in the opposite direction, Gregory and Zangwill
(1963) have attributed the origin of
autokinesis mainly to fluctuations in the efficiency
of the monitoring component rather
than the eye-muscle component. Their position
is not satisfactorily established. It is possible
to account for autokinesis and latency
periods by reference to peripheral strain and 
adaptation-level effects as previously done by 
the present authors in the subsection titled 
Muscle-Strain Theories. There is no evident 
a priori reason why peripheral influence on 
the “comparator” of the Helmholtz model 
should be limited to peripheral input from 
the retina only. 

Gregory and Zangwill acknowledge the 
need to explain why autokinesis is not experi- 
enced in normal daylight viewing when one 
might also expect inefficiency of the monitor- 
ing component. They do not cope with the 
problem of explaining the striking effects of 
suggestion on both extent and direction of 
autokinesis. However amended, the model of
Helmholtz (1924) would not appear to throw
much light upon the heavily documented effects
of expectancy, “satiation,” sensory-tonic
interaction, and frame of reference without
recourse to an explicit theory of the mechanics
underlying central aspects of perception
and volitional “command.” One such
theory is available and is ripening empirically
(Heckenmueller, 1965). Hebb’s (1963) semi-autonomous
process would appear to encompass
many of the findings in the field of autokinesis
and, in view of its special emphasis
on relative perceptual autonomy, relates well
to a basic fact of autokinesis, that is, turning
the lights on stops autokinetic motion. Before
a Hebbian interpretation of autokinesis can
be advanced, it is essential to show whether
the predictions afforded by Hebb’s theory are
confirmed over the various methods of
measurement. According to Hebb (1949),
sensory, associative, and motor elements are
integrated in relatively autonomous
cortical function. Therefore, it would seem
that qualitative differences in autokinesis
should not exist between various methods of
measuring autokinesis unless, as in the case
of tracking procedures, one method of measurement
provides veridical feedback tending
to eliminate the illusion itself.


SUMMARY 


Although autokinesis may at first appear to 
be a simple isolated phenomenon, the experi- 
mental literature reveals that it is multiply 
determined and relevant to a variety of im- 
portant problems and theories of perception. 
Understanding of autokinesis would provide 
us with considerable insight concerning the 
psychology of basic perceptual processes. Un- 
fortunately, little firm progress has been made 
to date. Much of the difficulty stems from
insufficient attention to response problems. 
Although a variety of indicator responses are 
possible for autokinesis and it is essential to 
assess their operational equivalence and indi- 
vidual merits, serious methodological studies 
of significant scope do not exist in the 
literature. 

It appears that autokinesis has been more 
of a plaything than a serious subject of study. 
Its methodology is embarrassingly naive and 
undisciplined. Clinical studies and studies of 
individual differences may be characterized as
conflicting and ambiguous. Although a num- 
ber of interesting theories of autokinesis have 
been proposed and supported within the 
serious limitation of poor methodology, no
attempts to decide empirically between theo- 
ries on the basis of opposed predictions have 
been made. 

In short, autokinesis is an important but 
difficult problem which has not been given 
the concerted study it requires.


The measurement of the galvanic skin response
from many sources. Recent work has
the response and has provided an appro 


elucidated the 


siders the measurement and analysis of the GSR in the light of this recent work. 


and the optimal electrode systems are 


and environmental variables which influence the response is included.


The galvanic skin response (GSR) is the 
most sensitive physiological indicator of psy- 
chological events available to the psychologist. 
However, the measurement technique is beset 
by numerous sources of error, and it is im- 
portant that these should be recognized if 
consistent results are to be obtained. Ten 
years have elapsed since the last detailed ac- 
count of the subject (Woodworth & Schlosberg, 
1954). In the interim there have been several
additional contributions, including an elucida- 
tion of the peripheral mechanism (Lader & 
Montagu, 1962). The purpose of this article is 
to present an account of the measurement of 
the GSR in the light of this recent work. 

The GSR, also known as the psychogalvanic
reflex (PGR) or electrodermal response
(EDR), covers two distinct but related
phenomena. The first, discovered by Féré
(1888), is a drop in the electrical resistance
of the skin to the passage of an applied current.
The second, the phenomenon of Tarchanoff
(1890), is a change in the natural potential
difference between two areas of the body surface.
Both phenomena normally occur together

Supported by Grant No. MY-3561 from the

1958) as part of the
, but the quantitative relationship
uncertain. The change in.


1936) and there is, as yet, no agreemen! 
to the significance or measurement of these 


(Montagu, 


and Sayer (1963) and Venables (1964). 
A distinction is sometimes made between 


evidence of any difference in the
mechanism. The GSR is merely a rapid change
in level occurring in response to a stimulus.
Consequently, no such distinction is made in 
the present article. 


PERIPHERAL MECHANISM 


Historically, there have been three basic 
theories regarding the peripheral mechanism 
of the GSR: (a) The muscular theory, which 
attributed the electrical changes to muscular 
activity beneath the skin at the site of the 
electrode (Sidis & Nelson, 1910). (b) The 
vascular theory, which maintained that it is 
caused by a change in tone of the blood 
vessels of the skin (Féré, 1888; McDowall, 
1933). (c) The secretory theory, which held 
that it is caused by changes in the activity of 
the sweat glands (Darrow, 1927; Peterson & 
Jung, 1907). 

The muscular theory was soon discarded 
(Waller, 1918). It has been only recently, 
however, that clear evidence has been ad- 
vanced in support of the secretory theory and 
against the vascular theory. Lader and Mon- 
tagu (1962) carried out two series of experi- 
ments in which skin resistance and pulse 
volume were recorded simultaneously from 
the same finger. In one series, atropine was 
introduced locally by iontophoresis in order to 
paralyze the cholinergically innervated sweat 
glands; in the other, bretylium (N-o-bromobenzyl-N-ethyl-NN-dimethylammonium)
was introduced in a similar manner in order
to paralyze the adrenergic nerve endings
governing vasomotor tone. Atropine was found
to abolish the GSR without affecting vasomotor
activity. Bretylium, on the other hand,
abolished vasomotor activity without affecting
the GSR. These results indicate that the
GSR is mediated solely through the sympathetic
cholinergic nerve supply to the skin,
and that it is entirely attributable to changes
in the sweat glands. The changes are probably
in the nature of a decrease in polarization accompanying
an increase in permeability of
the cell membrane (Gildemeister, 1915).

FIG. 1. Electrical model of organism in relation to
the GSR. (R_0 = resistance of dry skin, R_I = resistance
of body interior, r_1 to r_n = secretory units, C =
skin capacitance.)


ELECTRICAL MODEL 


When two electrodes are placed on the 
intact surface of the body, the resistance be- 
tween them is virtually the sum of the skin 
resistances at the electrode sites. The resist- 
ance of the body interior is negligible in 
comparison. Within the skin, the resistance 
lies largely in the stratum corneum (Lawler, 
Davis, & Griffith, 1960) which acts as an 
insulator over the body surface. However, the 
stratum corneum is perforated by the sweat 
ducts, which offer potentially conducting 
pathways through the barrier depending upon 
the activity of the sweat glands. An increase 
in their activity results in a drop in resist- 
ance, that is, in an increase in conductance 
through the skin (Darrow, 1934). 

Thomas and Korr (1957) showed that the 
conductance varies linearly with the number 
of sweat glands which are actively engaged in 
propelling sweat to the surface. Conductances 
are additive when they are in parallel. Con- 
sequently, the physiological units can be 
regarded as potentially conducting pathways 
in parallel, each pathway being switched in 
when the unit is active. These pathways are
represented in the electrical model of Figure 1
by the resistances r_1, r_2, r_3, . . . r_(n-1), r_n. When
the sweat glands are naturally inactive
(Thomas & Korr, 1957), or when their activity
has been paralyzed by atropine (Lader
& Montagu, 1962), there is a small residual
conductance. It is therefore necessary to
postulate an additional, nonsudorific pathway
in parallel with the sudorific pathways. This is
denoted by R_0 in Figure 1. The resistance R_I
in series with all the other resistors represents
the resistance of the body interior. This will
be referred to again in connection with
alternating current measurements. Its value is
of the order of only a few hundred ohms.
Consequently, as stated above, it can be
ignored. In relation to direct currents, therefore,
the body may be regarded as a set of
resistors all in parallel, each of which (except
R_0) may be switched in or out depending
upon the activity of the sweat glands.
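
The parallel-resistor model described above can be sketched numerically. The following is only an illustration: the function name and all resistance values are hypothetical, and the body interior resistance R_I is ignored, as the text allows for direct currents.

```python
def total_conductance(unit_resistances_ohm, active, residual_resistance_ohm):
    """Conductance (mhos) of the active secretory pathways in parallel with R_0."""
    g = 1.0 / residual_resistance_ohm          # nonsudorific pathway, always present
    for r, is_active in zip(unit_resistances_ohm, active):
        if is_active:                          # a pathway counts only when switched in
            g += 1.0 / r
    return g

# Hypothetical values chosen only for illustration.
R0 = 2_000_000.0                 # residual resistance R_0, ohms
units = [1_000_000.0] * 10       # ten identical secretory units, ohms each

for n_active in (0, 1, 5, 10):
    active = [i < n_active for i in range(len(units))]
    g = total_conductance(units, active, R0)
    # Conductance rises by equal steps per unit; resistance does not fall by equal steps.
    print(n_active, round(g * 1e6, 2), "micromhos", round(1.0 / g), "ohms")
```

Each unit switched in adds the same increment of conductance, whereas the corresponding resistance falls by unequal steps.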

Although the physiological units are rep- 
resented in this model by resistors, their 
resistance is not in fact a true ohmic property. 
The apparent resistance of the sweat glands 
is attributable to the generation by the ap- 
plied current of a back electromotive force 
(EMF) across the polarized cell membranes 
(Gildemeister, 1928). Since the back EMF 
opposes the flow of the applied current, it is 
resistive. However, it behaves like an ohmic 
resistance only within certain limits. This 
influences the method of measurement and will 
be discussed in the appropriate section. 


UNIT OF MEASUREMENT

Scale of Measurement


The term GSR is a misnomer (and so are 
the alternatives, PGR and EDR). The drop 
in electrical resistance of the skin which fol- 
lows a stimulus is not, in itself, the response. 
It is a result of the response. The response is 
an increase in sweat-gland activity. The most 
biologically meaningful scale of measurement 
for the GSR, therefore, is that which provides 
a linear relationship with sweat-gland activity.

Darrow (1934) first drew attention to the 
fact that the reciprocal of the skin resistance, 
namely, the conductance, varies with the 
amount of perspiration. More recently, Thomas 
and Korr (1957) demonstrated that the con- 
ductance varies linearly with the number of 
active sweat glands. This occurs because the 
sweat glands are in parallel, and conductances
in parallel are additive; resistances in parallel,
on the other hand, are an inverse function.
Although, therefore, it may be more methodologically
convenient to measure the resistance,
the values should be reciprocated to
conductances.


FIG. 2. Nature of relationship between skin conductance
G (solid line) and number of active sweat
glands. (The broken line illustrates approximate nature
of relationship between admittance Y at 60 cps
and sweat-gland activity.)

The relationship demonstrated by Thomas and
Korr (1957) between skin conductance and
sweat-gland activity is illustrated in
Figure 2. It may be expressed by the equation
G = Nm + G_0, in which N is the number
of active sweat glands [...]
activity has been paralyzed by atropine.
Under these conditions, the conductance may
drop to less than one twentieth of the initial
value before atropinization (Lader & Montagu,
1962). For practical purposes, therefore,
G_0 may be ignored, and the skin conductance
may be regarded as directly proportional to
the number of active sweat glands.

Transformation of the Scale

It has been the most general finding that the magnitude
of the GSR, measured as the change in resistance,
is intimately related to the background
level of resistance from which the change originates
(see Lacey, 1956). There are marked
variations in the background level both be- 
tween and within individuals. Consequently, 
in order to obtain a valid comparison between 
GSRs, the literature on the appropriate unit of 
measurement has been dominated by the ap- 
parent need to control for the dependence on 
background level. This has led to a wealth of 
suggested transformations of the basic units, 
which have emerged as a result of statistical 
maneuvers. As Lacey (1956) has pointed out, 
however, these transformations have often 
been applicable only to the original data 
from which they were developed. Moreover, 
the underlying physiological mechanisms have 
been ignored in the search for statistical ac- 
ceptability. 

The universal validity of the need for 
independence from the background level 
should not be accepted without question. The 
following example may serve to illustrate this 
point. There is evidence that levels of arousal 
may differ within certain clinical psychiatric 
groups, such as neurotics and schizophrenics 
(e.g., Claridge, 1961; Venables & Wing, 
1962). The background level of skin con- 
ductance is regarded as an indicant of the 
level of activation or arousal (Woodworth & 
Schlosberg, 1954). Consequently, if an experi- 
ment has been designed to compare the 
responsiveness to a standard stimulus of two 
groups differing in arousal level, the use of a 
unit of measurement that has been selected 
because it is independent of background level 
may defeat the object of the investigation. 

It is important to realize that the GSR 
per se is of little interest to the psychologist. 
Neither is the actual sweating response. Its 
value lies in its use as a physiological indicator 
of psychological events. This point was elabo- 
rated by Champion (1951). At that time, he 
stated that the full use of the GSR as an in- 
dicator awaits information regarding the func- 
tional relationship between it and the in- 
dicated dimension. That information is still 
awaited. However, Darrow (1937) has em- 
phasized that psychological events are seldom 
linearly related to physiological effects. Bio- 
logical systems tend to obey logarithmic laws. 


J. D. MONTAGU AND E. M. COLES 


On these grounds, he proposed the use of a
log conductance scale. Recently, Lader (1963)
has drawn attention to the significance of expressing
the GSR on this scale, that is, as
the change in log conductance. (It is imperative
to note the difference between change in
log conductance and log change in conductance;
the two measures have sometimes been
confused. For this reason, the ambiguous term
log conductance change is to be deprecated.)

When a response is analyzed as the change 
in log conductance, the magnitude of the re- 
sponse is expressed as the ratio of the final 
conductance to the initial value at the begin- 
ning of the response, that is, a change in 
conductance from 2 to 4 µmhos has the same
significance as a change from 20 to 40 µmhos.
Since the conductance is approximately pro- 
portional to the sudorific activity, the ratio 
gives a measure of the relative number of 
sweat glands which are actively secreting at 
the end of the response in proportion to the 
number which were already operating before- 
hand. It must be emphasized, however, that 
the measure is meaningful in these terms 
only because the relationship between sudorific 
activity and conductance passes virtually 
through the origin (Fig. 2).

It is worth noting that since change in
log conductance is a ratio, it is mathematically
equal (although of opposite sign) to change in
log resistance. This may offer a rapid method
for the calculation of changes in log conductance
from resistance measurements. Nevertheless,
it does not give biological significance
to log resistance as a scale of measurement.
It is meaningful to think only in terms of
conductance units.
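
The properties of the change in log conductance discussed above can be checked with a few lines of arithmetic. This is a minimal sketch; the function names are illustrative, and base-10 logarithms are an assumption (any base gives the same conclusions).

```python
import math

def change_in_log_conductance(g_initial, g_final):
    """Change in log conductance, i.e. log of the ratio final/initial."""
    return math.log10(g_final) - math.log10(g_initial)

def change_in_log_resistance(r_initial, r_final):
    """Change in log resistance for the same response."""
    return math.log10(r_final) - math.log10(r_initial)

# The two responses used as examples in the text (micromhos).
small = change_in_log_conductance(2.0, 4.0)
large = change_in_log_conductance(20.0, 40.0)

# The same 2 -> 4 micromho response expressed as resistances in ohms.
r_change = change_in_log_resistance(1e6 / 2.0, 1e6 / 4.0)

# "Log change in conductance" is a different quantity and depends on the level.
log_of_change_small = math.log10(4.0 - 2.0)
log_of_change_large = math.log10(40.0 - 20.0)

print(small, large, r_change, log_of_change_small, log_of_change_large)
```

The first two values coincide, the third is their negative, and the last two differ from each other, which is why the two measures must not be confused.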


DC AND AC METHODS


The GSR as originally described (Féré,
1888) is a drop in the resistance of the skin
to the passage of a direct current. However,
the response can also be recorded with low-frequency
alternating currents, and both
methods of measurement are still in common
use. The values obtained by the two methods
are not the same. When ac is used, the function
measured is the impedance (apparent
resistance) or its reciprocal, admittance,
which corresponds to the dc conductance.
Like their dc counterparts, the ac functions
are also measured in ohms and mhos, respectively.

The skin impedance is invariably lower 
than the dc resistance owing to the presence 
of capacitance, which provides a conducting 
pathway for ac only. It is represented in 
Figure 1 by the capacitor C in parallel with
the resistive units. Moreover, the ability of a 
capacitor to conduct ac varies directly with 
the frequency. Consequently, if the frequency 
is increased, the skin impedance decreases. 
Ultimately, with increasing frequency, the 
capacitor becomes such a good conductor of 
ac that it virtually short-circuits the resistive 
components of the skin.² For this reason, the
GSR cannot be recorded with ac above a 
certain frequency. Gildemeister (1915) was 
unable to detect any response at a frequency 
of 5 kcps. Subsequently, Forbes (1933) and 
Forbes and Landis (1935), using more sensi- 
tive apparatus, defined the limiting frequency 
at around 10 kcps. In practice, when the GSR
is measured with ac, a frequency of 50 or
60 cps is usually used, depending upon the
frequency of the mains electricity supply.
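
The frequency dependence described above can be illustrated with a simplified version of the model in Figure 1: a conductance G in parallel with an ideal capacitor C. This is a sketch under stated assumptions only. The series resistance R_I is ignored, C is treated as ideal and constant, and the numerical values of G and C are hypothetical rather than measured.

```python
import math

def admittance_magnitude(conductance_mho, capacitance_farad, frequency_cps):
    """|Y| of a conductance in parallel with a capacitor: sqrt(G^2 + (2*pi*f*C)^2)."""
    susceptance = 2.0 * math.pi * frequency_cps * capacitance_farad
    return math.hypot(conductance_mho, susceptance)

G = 10e-6      # hypothetical dc conductance, 10 micromhos
C = 0.02e-6    # hypothetical skin capacitance, farads

for f in (0.0, 60.0, 1_000.0, 10_000.0):
    y = admittance_magnitude(G, C, f)
    # Impedance is the reciprocal of admittance; it falls as frequency rises.
    print(f, "cps:", round(1.0 / y), "ohms impedance")
```

At zero frequency the admittance equals the dc conductance; as the frequency rises, the capacitive term dominates and changes in G contribute progressively less to the total.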

Since the values obtained with ac and dc
differ, the question arises as to whether the 
results can be regarded as comparable, as has 
sometimes been assumed. This has recently 
been investigated by Montagu (1964), who 
recorded the dc resistance and the impedance 
at 60 cps simultaneously through the same 
electrodes. For the reasons given above, the 
dc measurements were expressed in terms of 
conductance; the ac measurements were ex- 
pressed in terms of the corresponding function, 
admittance. There were three principal find- 
ings. In the first place, conductance and ad- 
mittance appeared to be linearly related. 


² When this happens, the residual impedance is the
impedance of the interior of the body, that is, R_I
in Figure 1. This impedance varies with changes in
the volume of the segment between the electrodes.
Consequently, the high-frequency impedance of an
extremity can be used to record the circulatory
changes in the part. This is the basis of the technique
known as radio-frequency impedance plethysmography
(Nyboer, 1944). It is of interest in the present connection
in view of the vascular theory regarding
the mechanism of the GSR. It is now evident that
the electrical manifestations of circulatory changes
can only be recorded at frequencies which exclude
polarization and membrane effects and at which the
GSR cannot be detected.


Secondly, the linear correlation between 
change in conductance and change in admit- 
tance during GSRs was virtually unity in 
every case. Thirdly, conductance and ad- 
mittance were closely related between subjects
in regard both to the levels (r = .99)
and to the changes during responses (r = .97).
It was concluded that the dc and ac
methods can be regarded as comparable pro- 
vided the results are analyzed in terms of 
conductance and admittance, respectively. 
An earlier section revealed that there are 
sometimes good reasons for analyzing dc re-
sponses in terms of the change in log conduct-
ance, that is, as the ratio of the final conduct- 
ance to the initial value. This raises the ques- 
tion of the appropriateness of change in log 
admittance as a measure of response magni- 
tude. It was emphasized that change in log 
conductance is biologically meaningful only 
because the skin conductance is virtually 
proportional to the number of active sweat 
glands (Fig. 2). With ac the case is different. 
The skin capacitance provides an additional 
conducting pathway so that the admittance 
is always greater than the conductance. The
approximate nature of the relationship be- 
tween admittance and sweat-gland activity 
is illustrated in Figure 2 by the broken line. 
(It is probable that the actual relationship 
is slightly curvilinear but this has not been 
verified experimentally.) This shows that, in 
contrast to the conductance, the residual ad- 
mittance in the absence of sweat-gland 
activity (Y_0) is too large to be neglected.
The admittance cannot therefore be regarded 
as proportional to the number of active sweat 
glands. Under these circumstances, the change 
in log admittance would be a meaningless 
measure without biological significance. 
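
The argument above can be made concrete by comparing the ratio produced by a doubling of sweat-gland activity when the residual term is small (conductance) and when it is large (admittance). This is an illustrative sketch only: it assumes the linear relationships drawn in Figure 2, and the slope and residual values are hypothetical.

```python
def ratio_after_doubling(n_initial, slope, residual):
    """Ratio final/initial of a linear measure N*slope + residual when N doubles."""
    before = n_initial * slope + residual
    after = 2 * n_initial * slope + residual
    return after / before

SLOPE = 0.1   # hypothetical micromhos per active sweat gland
G0 = 0.1      # hypothetical small residual conductance, micromhos
Y0 = 10.0     # hypothetical large residual admittance, micromhos

for n in (20, 200):
    g_ratio = ratio_after_doubling(n, SLOPE, G0)
    y_ratio = ratio_after_doubling(n, SLOPE, Y0)
    # The conductance ratio stays close to 2; the admittance ratio depends on level.
    print(n, round(g_ratio, 3), round(y_ratio, 3))
```

With a negligible residual the ratio reflects the relative increase in active units at any level; with a large residual the same relative increase yields different ratios at different levels.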
If analysis of results on a log scale is
anticipated, it is clearly necessary to use the
dc method. When the raw conductance scale
is appropriate, either method may be used.
In theory, the dc method has a major advantage
in that it measures pure electrical
functions, resistance or conductance. Impedance
and admittance are composite functions
which are dependent upon an additional factor,
capacitance. Moreover, there is some
evidence to suggest that there may be slow
variations in skin capacitance which are unrelated
to sudorific activity (Montagu, 1964).
Nevertheless, the high correlation between dc 
and ac measurements indicates that the effect 
is unlikely to be a source of much error, 
particularly in regard to GSRs and short-term 
changes in background level. On the other 
hand, ac has certain advantages by compar- 
ison with dc. In the first place, polarization of 
the electrodes is minimized. Secondly, ac 
measurements are not contaminated by the 
natural (endosomatic) potential difference be- 
tween the electrode sites. Consequently, the 
applied ac can be reduced to a much lower 
level than is possible with dc. However, with 
ac it is essential that there should be no 
interference from stray electric fields. Any 
pick-up at the electrodes will be included 
with the signal, and if, as is usually the case, 
the frequencies are the same, it may pass 
unnoticed as a serious source of error. 


PRINCIPLES OF MEASUREMENT 


Since dc measurements are more generally 
valid than ac measurements, the principles 
underlying the measuring techniques will be 
considered in relation to the dc method. The 
same general principles are also applicable 
to ac with the exception, mentioned above, 
that the applied current or voltage can be 
much smaller. Since ac measurements are not 
subject to error from the endosomatic poten- 
tial difference, the applied voltage can be as 
low as the sensitivity of the amplifier and the 
noise level permit. 

The resistance R in a circuit is given by
the familiar equation R = V/I, in which V is
[...] (b) The constant-voltage
method, by applying a steady voltage across
[...] is, it will vary directly with
the conductance. Although the GSR may be
recorded by either method, the constant-current
principle has generally been used.

Whichever method is used, there are
important considerations which govern the
parameters of the source. In the first place, it
is well known that the apparent resistance
offered by the tissues decreases as the voltage
is increased, at any rate above a certain level
(Galler, 1913; Gildemeister, 1915). Moreover,
with higher voltages, the resistance
drops progressively during the time the current
flows (Gildemeister, 1913 [...]). A
recent study (Edelberg, Greiner, & Burch,
1960) has indicated that the resistance is
independent of the current density if this
is below a certain level. This th[...]
been interpreted as a sign of “injury” to the
cell membrane, and it seems necessary that
the current should be kept below this level if
consistent results are to be obtained. On the
other hand, the voltage across the electrodes
must be large in comparison with the endosomatic
potential difference between the
sites, which will otherwise intrude as a source
of serious error. The difference between
palmar skin and an “indifferent” area may
reach 50-60 mv., the palm being [...].
It is evident that the error attributable to
the endosomatic potential difference will vary
in direction with the polarity of the source.
Furthermore, it has also been shown that the
skin resistance itself varies with the direction
of the applied current (Gildemeister, 1928;
Edelberg, Greiner, & Burch, 1960; Rosendal,
1944). On both these grounds, it is necessary
to observe a consistent polarity of the electrodes.


Constant-Current Method

The principle is shown in Figure 3 A. The
electrodes are connected in series with a resistor
r, the value of which must be large
in comparison with the subject’s resistance
R_s. The current therefore remains virtually
constant regardless of fluctuations in R_s.
To ensure adequate constancy of the current,
r should be at least 10 times greater
than the highest value of R_s. Since this
resistance is large, the voltage of the source [...]


With a constant-current system, the current
density per unit area of electrode is fixed.
However, the effective density per physiological
unit, which is the important factor, will
vary inversely with the number of active units.
This effect, which may at first seem paradoxical,
follows logically from the model in
Figure 1. The greater the number of active
sweat glands, the greater will be the number
of parallel pathways sharing the current. Conversely,
if there are only one or two active
units, virtually the whole current must pass
through them. The situation would be
different if all the sweat glands [...]
quantitatively in unison, but this is not the
case. The skin conductance varies linearly
with the number of active sweat glands
(Thomas & Korr, 1957). On these grounds,
it is evident that the maximum current which
can be passed without exceeding the threshold
of injury will be determined by the minimum
number of active units, that is, by the highest
skin-resistance value. This will depend not
only upon the electrode area, but also upon
the sweat-gland density as well as the func[...]

[...] the optimal conditions meaningfully in terms
of current density. The greater the number
of active units, that is, the lower the skin
resistance, the greater will be the current
which can be passed. On this basis, the important
factor would seem to be the product
of the current and the resistance, namely,
the voltage, which should remain constant.

In the technique which has been used by 
the authors for several years, a constant current
of 10 µa. is passed through an “active”
electrode of .7 cm² on the distal phalanx of
the thumb. Under these conditions, the recorded
resistances fall, with few exceptions,
between 25,000 and 500,000 ohms, that is,
the measured voltages range from 250 mv. to
5 v. The higher figure is probably excessive
in the light of Gildemeister’s (1913) findings.
On the other hand, if the current is reduced,
the smallest voltages will approximate the
endosomatic potential difference. The chosen
conditions appear therefore to be a reasonable
compromise.
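
The voltage range quoted above follows directly from V = IR. The short check below uses only the figures given in the text; the function name is illustrative, and the 60-mv. figure is the upper end of the endosomatic potential difference mentioned earlier.

```python
def electrode_voltage(current_a, resistance_ohm):
    """Voltage across the subject for a given constant current (V = I * R)."""
    return current_a * resistance_ohm

CURRENT = 10e-6                      # 10 microamperes, as in the text
LOW_R, HIGH_R = 25_000.0, 500_000.0  # range of recorded resistances, ohms
ENDOSOMATIC = 0.060                  # upper figure for the endosomatic potential, volts

v_low = electrode_voltage(CURRENT, LOW_R)
v_high = electrode_voltage(CURRENT, HIGH_R)

# How many times larger the smallest measured voltage is than the endosomatic potential.
margin = v_low / ENDOSOMATIC
print(v_low, v_high, round(margin, 1))
```

The smallest measured voltage is only a few times larger than the endosomatic potential difference, which is why a further reduction in current is not advisable.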

With the constant-current method, conductance
values are calculated from resistance
measurements. The inverse relationship between
these functions has certain effects which
are worth noting. Since a small conductance [...]

[...] the lowest value of R_s.

With a constant-voltage source, the situa-
tion regarding the current density is the con- 
verse of that in the constant-current method. 
Since the physiological units are in parallel, 
the voltage across each unit will be constant 
and equal to the voltage of the source. Al- 
though, therefore, the total current will vary 
with the number of active units, the current 
density per active unit will remain the same 
regardless of their number. It will depend only 
upon the applied voltage. The optimal voltage 
for the source will therefore be independent 
of the sweat-gland density and area of elec- 
trode; it will be determined solely by the 
characteristics of the individual unit. 
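
The contrast between the two methods in regard to the current through each active unit can be sketched as follows. The sketch assumes identical secretory units, ignores the residual pathway R_0 and any series resistance, and uses hypothetical values throughout.

```python
def per_unit_current_constant_current(total_current_a, active_units):
    """Identical parallel units share a fixed total current equally."""
    return total_current_a / active_units

def per_unit_current_constant_voltage(source_v, unit_resistance_ohm):
    """Each parallel unit sees the full source voltage, whatever the others do."""
    return source_v / unit_resistance_ohm

TOTAL_CURRENT = 10e-6        # amperes, constant-current source
SOURCE_V = 1.0               # volts, constant-voltage source
UNIT_R = 10_000_000.0        # hypothetical resistance of one active unit, ohms

for n in (1, 10, 100):
    cc = per_unit_current_constant_current(TOTAL_CURRENT, n)
    cv = per_unit_current_constant_voltage(SOURCE_V, UNIT_R)
    # cc falls as more units become active; cv does not depend on n at all.
    print(n, cc, cv)
```

Under constant current the load on each unit varies inversely with the number of active units, whereas under constant voltage it is fixed by the applied voltage alone.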

There is conflicting evidence regarding 
the maximum voltage which may be applied 
to the skin without altering its resistance. 
The ac measurements of Gerstner and Gerb- 
städt (1949) on the forearm indicate that the 
limiting value is in the region of .8 v. Above 
3 v., the impedance varied with the duration 
of current flow as well as with the applied 
voltage. These findings may not be strictly 
applicable to dc. Lykken (1959) found that 
the dc conductance of the back of the hand 
remained absolutely constant at voltages from 
1.5 to 7.5 v. This is in contrast to the ob- 
servations of Gildemeister (1913) on the fore- 
arm. The most applicable evidence has been 
provided by the controlled experiments of 
Edelberg et al. (1960) on the optimal cur- 
rent density for measurement of the GSR. 
In these experiments, the voltage across the 
electrodes varied up to 1.5 v. when the cur- 
rent was kept just below the mean value at 
which evidence of injury was manifested. As 
far as can be ascertained at present, therefore, 
a source of 1 v. would seem to be a reasonable 
choice. Since this is approximately 20 times 
greater than the endosomatic potential differ- 
ence, the error from this source will be small. 
If the resistance of the subject varies between
extreme values of 500,000 ohms (2 µmhos)
and 20,000 ohms (50 µmhos), the current
will range from 2 to 50 µa. With a resistor of
1,000 ohms for r in Figure 3 B, the measured
voltage across r will vary between 2 and
50 mv., that is, 1 mv. per µmho of conductance.
Since these voltages are much smaller
than those measured in the constant-current
method, the constant-voltage method requires
a more sensitive amplifier. With the advent
of chopper amplifiers, this presents no
problem.
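
The figures in the preceding paragraph can be reproduced as follows. The first function gives the idealized reading quoted in the text (r treated as negligible beside the subject); the second, added here as an assumption about the simple series circuit, includes r in the current path to show the size of the approximation. Function names are illustrative.

```python
def measured_voltage_ideal(source_v, subject_conductance_mho, r_ohm):
    """Voltage across r when r is negligible beside the subject: V * G * r."""
    return source_v * subject_conductance_mho * r_ohm

def measured_voltage_with_series_r(source_v, subject_conductance_mho, r_ohm):
    """Voltage across r when r is counted in the series circuit."""
    subject_resistance = 1.0 / subject_conductance_mho
    current = source_v / (subject_resistance + r_ohm)
    return current * r_ohm

SOURCE_V = 1.0      # volts, as proposed in the text
R_SERIES = 1_000.0  # ohms, the resistor r of Figure 3 B

for g_micromho in (2.0, 50.0):
    g = g_micromho * 1e-6
    ideal_mv = measured_voltage_ideal(SOURCE_V, g, R_SERIES) * 1e3
    exact_mv = measured_voltage_with_series_r(SOURCE_V, g, R_SERIES) * 1e3
    print(g_micromho, "micromhos:", round(ideal_mv, 2), "mv ideal,",
          round(exact_mv, 2), "mv with r in series")
```

The idealized reading is 1 mv. per µmho; the departure from it grows as the subject's resistance approaches r, which is why r must be kept small in comparison with the lowest subject resistance.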

In summary, the constant-voltage method
would seem to possess distinct advantages
over the constant-current principle. Firstly,
it gives a direct measure of conductance.
Secondly, it is apparent theoretically that the
current flowing through each sweat gland is
independent of the state of activity of the
other glands. In practice, the constant-current
method has been generally used under the
impression that the current density through
the skin remains constant. It has been argued
here that this cannot be the case. On these
grounds, it would seem that the constant-voltage
method deserves more attention than
it has received.


Bridge Circuits 


The GSR is measured as a small change
in voltage which is superimposed upon a relatively
large background voltage. In order to
measure the change with accuracy, it is necessary
to balance off the standing voltage to
the nearest convenient value and to amplify
the residual unbalanced voltage. A bridge
circuit, in which the bias voltage is obtained
from the source which supplies the subject,
offers one way of doing this. Such circuits
were used almost universally before electronic
amplifiers came into general use; the only
recording instrument available then was a
sensitive galvanometer.

A bridge circuit is not an alternative to
the constant-current and constant-voltage
methods. It must still obey one or other
principle if it is to give a linear scale of
measurement. The basic circuit is shown in
Figure 3 C. The resistance R_1 in series
with the subject is equivalent to the resistor
r in Figure 3 A or 3 B. At the point of
balance, there will be no potential difference
between the points X and Y. The condition
for this is that R_1/R_s = R_2/R_3. If R_2 is
made equal to R_1, R_s will equal R_3 at the
point of balance, and the output voltage between
X and Y will be proportional to the
difference between R_s and R_3. For a constant-current
bridge, R_1 must be large in comparison
to R_s, and the source voltage will be high.
R_3 may consist of a chain of equal resistors
in series. For a constant-voltage bridge, R_1
must be small in comparison to R_s, and the
source will be of low voltage. Since, in this
case, R_3 has to balance the conductance of
the subject, it may conveniently consist of
a set of equal resistors which are added in
parallel to give equal increments in conductance.

MECHANISM AND MEASUREMENT OF THE GALVANIC SKIN RESPONSE
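
The balance condition can be checked with a small calculation. Since Figure 3 C is not reproduced here, the layout below is an assumption: two voltage dividers (R_1 with R_s, and R_2 with R_3) across a common source, with X and Y at their midpoints. The component values are hypothetical and represent a constant-current arrangement (R_1 large in comparison to R_s).

```python
def bridge_output(source_v, r1, rs, r2, r3):
    """Voltage between the midpoints X and Y of two dividers (assumed layout)."""
    v_x = source_v * rs / (r1 + rs)   # midpoint of the R_1 / subject arm
    v_y = source_v * r3 / (r2 + r3)   # midpoint of the R_2 / R_3 arm
    return v_x - v_y

SOURCE_V = 50.0       # hypothetical high-voltage source
R1 = R2 = 5_000_000.0 # ohms, large in comparison to the subject
R3 = 100_000.0        # ohms, balancing resistor

for rs in (100_000.0, 110_000.0, 120_000.0):
    # Zero at balance (rs == R3); close to proportional to rs - R3 nearby.
    print(rs, round(bridge_output(SOURCE_V, R1, rs, R2, R3), 4), "volts")
```

With R_2 equal to R_1 the output is zero when R_s equals R_3, and for small departures it rises nearly in proportion to the difference between them.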

In an ac bridge, it is necessary to balance 
both the resistive and capacitative components 
of the impedance. Bridge circuits are there- 
fore not suitable for the continuous recording 
of skin impedance or admittance. It is prefer- 
able to use the principles shown in Figure 3 
A or 3 B and to balance the signal after it 
has been amplified and then rectified. 


ELECTRODES 


General Principles 


The GSR is usually recorded with a unipolar
electrode system, that is, with one electrode
on an appropriate area of the hand or foot
and the other on some “indifferent” part of
the body. In this case, it is necessary to [...]

FIG. 4. Three-electrode technique for measurement of the skin
resistance at a single site, A. (Inset shows equivalent electrical
network, in which R_A, R_B, and R_C are the skin resistances at
Electrodes A, B, and C, respectively.)

[...] It becomes a function of resistances in series
[...] there [...]
might appear to possess advantages. Firstly,
the inactive electrode is obviated. Secondly, 
the endosomatic potential difference between 
homologous areas will approximate zero so 
that the applied current or voltage can be 
reduced to a much lower level than is pos- 
sible with the unipolar method. With the 
bipolar method, the measured resistance is 
the sum of the individual resistances at the 
two sites, and, by halving all the figures, a 
mean value is obtained. In terms of resistance, 
therefore, the bipolar method in effect samples 
a larger area of skin. However, in terms of 
conductance, the position is different. The two 
electrodes are in series, and conductances in 
series are an inverse function. The resultant 
conductance G is equal to G_1 G_2/(G_1 + G_2), which
is not half the mean value for the two sites
unless the individual conductances G_1 and G_2
are the same. If they are dissimilar, the total 
conductance will be determined predominantly 
by the smaller of the two components, and 
the relationship between total conductance 
and mean sweat-gland activity will not be 
linear. From experiments in which unipolar 
recordings have been taken simultaneously 
from two homologous areas, it has been found 
that their conductances sometimes differ con- 
siderably throughout the test period (cf. 
Obrist, 1963). Consequently, since symmetry 
cannot be assumed, the unipolar method must 
be regarded as preferable to the bipolar 
system. 
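
The effect of asymmetry between the two sites in the bipolar method can be shown with the series formula given above. The conductance values are hypothetical and chosen so that both pairs have the same mean.

```python
def series_conductance(g1, g2):
    """Resultant conductance of two conductances in series: G1*G2 / (G1 + G2)."""
    return g1 * g2 / (g1 + g2)

# Two pairs of sites with the same mean conductance (10 micromhos).
symmetric = series_conductance(10.0, 10.0)
asymmetric = series_conductance(2.0, 18.0)
half_mean = 10.0 / 2.0

# Only the symmetric pair gives half the mean; the asymmetric pair is
# dominated by the smaller of the two conductances.
print(symmetric, asymmetric, half_mean)
```

When the sites differ, the bipolar reading falls well below half the mean and lies close to the smaller conductance, so it is no longer linearly related to mean sweat-gland activity.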

There remains a third method, the three- 
electrode technique for the measurement of 
skin resistance at a single site (Barnett, 
1938; Horton & van Ravenswaay, 1935). 
This is illustrated in Figure 4. It is a modifi- 
cation of the unipolar system in which the 
inactive electrode is replaced by two separate 
electrodes, B and C. The constant current is 
applied through the active Electrode A and 
Electrode B; the voltage is recorded from 
Electrodes A and C. If the measuring device 
is an amplifier with a high input resistance, 
there will be virtually no current flowing 
through the measuring circuit A-Amplifier-C-A.
Consequently, the skin resistance at
Electrode C will not influence the measurements.
These will depend only upon the voltage
drop across the skin at the common
Electrode A. If Electrode C is now placed 
upon the contralateral homologous site to the 
active Electrode A, as shown in Figure 4, 
there will also be a negligible error attribu- 
table to the endosomatic potential difference 
since this will approximate zero. The ar- 
rangement has been used by one of the 
authors (J.D.M.) with satisfactory results. 
It has been tested by switching the amplifier 
from Electrodes A and C to A and B, that is, 
to the usual unipolar connections. This re- 
sults in an increase in the measured resis- 
tance, which is attributable to the inclusion 
of the resistance at the inactive Electrode B. 
When the resistance at this site alone has 
then been measured by the three-electrode 
technique, the value obtained has been found 
to agree. 

Although the three-electrode system does 
not appear to have been used previously in 
connection with the GSR, it would seem to 
be the method of choice for skin resistance 
measurement, that is, when the voltage is 
measured. It will be apparent, however, that 
the system cannot be used for skin conduc- 
tance measurements with a constant-voltage 
source. Since in this case it is the current
flowing through the subject which is meas- 
ured, the current must inevitably flow 
through both recording electrodes. 


Polarization 


Polarization refers to the generation of an
electromotive force in the opposite direction
to that of the original current, that is, it is a
back EMF. This behaves as an apparent
resistance which, in the case of electrodes, is
in series with the subject. The effect is to 
cause an apparent gradual increase in the 
subject’s resistance above the true value. 

There are four methods whereby polariza- 
tion of electrodes may be minimized: (a) 
Reduction of the current density. The amount 
of polarization varies with the current per 
unit area of the electrode. (b) Periodic re-
versal of the polarity. Any polarization ef-
fects which build up will tend to be reversed
during the ensuing half cycle. (c) The use of
ac. This is in effect a simple method of re-
versing the polarity. (d) The use of non-
polarizable electrodes.

The first three methods have a limited ap- 
plication in the measurement of the GSR. 
It has already been seen that the applied 
current or voltage must be determined on 
other grounds. Merely increasing the size of 
the active electrode will not affect the issue 
since the current will either increase pro- 
portionately for the same applied voltage or 
it must be increased proportionately to give 
the same voltage across the electrodes. Either 
way, the current density will remain the same. 
Periodic reversal of the polarity every few 
minutes suffers from the disadvantage that 
the skin resistance, which is itself a polariza- 
tion phenomenon, also varies with the direc- 
tion of the current (Edelberg et al., 1960; 
Gildemeister, 1928; Rosendal, 1944). On the 
other hand, if the polarity is reversed con- 
tinuously by using ac, the skin capacitance 
intrudes into the measurements. The capaci- 
tative effects become negligible if the fre- 
quency is made sufficiently low, and this 
offers a practical method of reducing elec- 
trode polarization. However, low frequency 
ac does not eliminate polarization completely 
(Lykken, 1959). The only method of achiev-
ing this is by means of nonpolarizable elec-
trodes.


Nonpolarizable Electrodes


With one exception, polarization always 
occurs when a current passes from a metallic 
conductor to a saline solution. The exception
is when the metal is in contact with one of its
own salts. This is the basis of all nonpolariz-
able electrodes. In practice, only two metal-
salt combinations have been found to be
suitable for skin-resistance measurements
(Lykken, 1959). They are the familiar silver-
silver chloride and zinc-zinc sulphate elec-
trodes. Readers are referred to Lykken’s arti-
cle for details of their preparation. The silver-
silver chloride electrode suffers from the
disadvantage that the chloride layer is stripped 
off the cathode by the passage of the measur- 
ing current. Consequently, it is necessary to 
ensure that the electrode has been chlorided 
to a sufficient depth over the whole surface to 
last for the duration of the experiment. Other- 
wise, it will lose its nonpolarizability. The 
zinc-zinc sulphate electrode does not have 
this disadvantage since the zinc sulphate, be- 
ing soluble, is contained in the electrode 
jelly. On the other hand, some evidence has 
been advanced to suggest that the Zn++ ion
may affect the skin resistance (Edelberg & 
Burch, 1962). It seems possible that this 
effect was attributable to the concentration 
which was used, since similar effects were ob-
tained with several other agents, including
sodium chloride in the same concentration 
(Edelberg et al., 1960). Nevertheless, the 
possibility of an adverse effect of the Zn++
ion should be borne in mind pending further
evidence.

The most suitable electrode for skin-re- 
sistance measurements is undoubtedly the 
double-element electrode developed by Lyk- 
ken (1959). Strictly speaking, this electrode 
is not nonpolarizable; polarization occurs, but 
it is excluded from the measuring circuit. The 
principle is similar to that used in the three- 
electrode technique for excluding skin resis- 
tance at the inactive site (Fig. 4). The dif- 
ference lies in the fact that the two elements 
of a double-element electrode are connected 
by a saline bridge to the same area of skin. 
Each electrode consists of a central disk and 
an outer annulus, both made of lead, in con- 
tact with a sodium chloride jelly. The princi- 
ple of operation is illustrated in Figure 5, in 
which the two elements are shown side by 
side. The measuring current I1 is applied
through the annular element of each elec-
trode; the voltage developed between the
electrodes is tapped off from the disks. If the
disks are connected to an amplifier with a
high input resistance, the current I2 flowing
through them will be virtually zero. Conse- 
quently, polarization will be negligible. The 
annuli, on the other hand, become polarized, 
but the apparent resistance attributable to 
this polarization (rp) is external to the meas- 
uring circuit and does not influence the re- 
sults. 

If, in Figure 5, the two elements of one of 
the electrodes (the inactive electrode) are 
separated into two distinct electrodes, the 
result will be a three-electrode system (Fig. 
4) with a double-element active electrode. In 
this case, both polarization and skin resis- 
tance at the inactive site will be excluded 
from the measurements. This would seem to 
be the system of choice for skin resistance 
measurements. It will be evident that double- 
element electrodes, like the three-electrode 
principle, cannot be used for skin-conductance 
measurements, that is, when the current is 
measured. In this case, it is necessary to use 
conventional nonpolarizable electrodes.


Active Electrode


Size. If the sweat-gland density is homo- 
geneous, the skin conductance will vary di- 
rectly—and the resistance inversely—with the 
area of skin from which measurements are 
made. This follows logically from the parallel 
arrangement of the physiological units, and 
it has been verified experimentally (Blank & 
Finesinger, 1946; Edelberg & Burch, 1962; 
Lykken, 1959). Since homogeneity of sweat- 
gland density cannot be assumed, even in 
specific localized areas, it is clearly impor- 
tant to standardize both the area and exact 
site from which recordings are made if com- 
parisons between and within subjects are to 
be valid. 
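
The proportionality between conductance and electrode area can be illustrated with a short sketch in Python. It treats the physiological units as identical conductances arranged in parallel, as the argument above assumes. The density of 100 units per cm² and the value of 0.05 micromho per unit are hypothetical figures chosen only to make the arithmetic visible, and the function names are illustrative rather than taken from any of the studies cited.

```python
import math

UNITS_PER_CM2 = 100        # hypothetical density of conducting units
UNIT_CONDUCTANCE = 0.05    # hypothetical conductance per unit, in micromhos


def conductance_for_area(area_cm2):
    """Total conductance (micromhos) of identical units in parallel.

    Conductances in parallel add, so the total is simply the number
    of units under the electrode times the conductance of one unit.
    """
    n_units = round(UNITS_PER_CM2 * area_cm2)
    return n_units * UNIT_CONDUCTANCE


def resistance_kohm(conductance_micromho):
    """Convert a conductance in micromhos to a resistance in kilohms."""
    return 1000.0 / conductance_micromho


def specific_conductance(conductance_micromho, area_cm2):
    """Conductance per unit area (micromhos per cm^2)."""
    return conductance_micromho / area_cm2


for area in (0.5, 1.0, 2.0):
    g = conductance_for_area(area)
    print(area, g, resistance_kohm(g), specific_conductance(g, area))
```

Under the homogeneity assumption, doubling the area doubles the conductance and halves the resistance, while the conductance per unit area is unchanged. As the text goes on to note, that assumption cannot be relied upon in practice.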

Blank and Finesinger (1946) showed that 
the effective electrode is the area of skin 
which is in electrical contact with the metal 
electrode through the medium of electrode 
jelly or sweat. The area of the metal is ir- 
relevant. It is therefore necessary to delimit 
the required area of skin with a masking 
device. A convenient device for this purpose 
is a self-adhesive annular corn plaster, the 
central hole of which is then filled with elec- 
trode jelly (Lykken, 1959). This has the 
additional advantage of controlling the elec- 
trode pressure, a factor which also influences 
the measurements (Gerstner & Gerbstädt,
1949). 

Site. Sweating is not uniform over the body 
surface (Kuno, 1934, 1956). In general, the 
sweat glands are concerned with temperature 
regulation. But in certain areas, notably the 
palmar and plantar regions, sweating is less 
influenced by temperature and is primarily 
associated with mental excitation. It is within 
these areas that the active electrode must be 
placed. 

Since the GSR is a manifestation of sweat- 
gland activity, it would seem logical to record 
from one of the areas where the glands are 
most densely populated. In the hand, these 
are the volar surfaces of the distal phalanges, 
the pads at the bases of the fingers, and the 
thenar and hypothenar eminences (Kuno, 
1934, 1956). The volar surface of a distal 
phalanx is the most convenient position for 
the secure attachment of an electrode. How-
ever, such sites are prone to cuts and punc-
tures, which may greatly reduce the skin
resistance. Consequently, the site should al- 
ways be inspected before the electrode is 
applied. The exact position of the electrode 
may conveniently be standardized by placing 
it centrally over the central whorl of the 
finger- or thumbprint (Lykken, 1959). 

Preparation. The less the preparation at
the site of the active electrode, the smaller is 
the likelihood that the conditions to be 
measured will be altered artificially. The use 
of grease solvents, for example, carbon tetra- 
chloride, is not to be recommended. It has 
been suggested that they may have a harm- 
ful effect on the semipermeable membranes 
of the sweat glands (Lader, 1963). It is 
usually sufficient to ensure that the subject
washes his hands with soap and water a few 
minutes beforehand. 


Inactive Electrode 


Preparation of the site of the inactive elec- 
trode is designed to produce an area which 
contributes minimally to the total resistance. 
Provided this has been achieved, the size and 
site of the electrode are immaterial. Never- 
theless, a large electrode will facilitate the 
attainment of a low resistance, in addition to 
minimizing polarization effects. The dorsum 
of the forearm is a convenient site which has 
often been used. 

Preparation. The usual methods of mini- 
mizing the resistance are by abrading, drill- 
ing, or piercing the skin for the purpose of 
removing or short-circuiting the high-resist-
ance locus. The relative merits of these meth-
ods have been discussed by Venables and
Sayer (1963), who concluded that “an ade- 
quate degree of sandpapering probably 
achieves the same result as rather heroic 
amounts of drilling.” Unfortunately, there is 
no consistent visual indication of what con- 
stitutes an adequate degree of sanding. The 
production of an erythema is certainly no
valid guide. A series of measurements taken
by one of the present authors (J.D.M.) from
the dorsum of the forearm after this had been 
sanded briskly with an emery board revealed 
that the residual resistance was sometimes far
from negligible. Subsequently, the resistance
at the inactive site was measured routinely, 
as described below. It was found that in 11 
of the first 100 cases, the residual resistance 
after abrasion was greater than 10,000 ohms 
(electrode diameter 2.5 cm.). Moreover, with
one exception, the highest values occurred in 
young female subjects. This unexpected ob- 
servation suggests that the finer the texture of 
the skin, the more difficult it is to remove the 
surface layer, in contrast with a tough scaly 
surface. 

As a result of these findings, the authors 
agree with Venables and Sayer (1963) that 
the only objective method of ensuring that 
preparation has been adequate is to measure 
the resistance at the inactive site. This should 
be done every time or until it has been 
shown that the technique produces the re- 
quired results every time. The simplest method 
is to apply the inactive electrode and to ask 
the subject to hold the active electrode on 
the moistened mucous membrane of the lip, 
that is, on a surface of negligible resistance. 
This gives rise to a potential difference 
between the electrodes as a result of the 
dissimilarity in the contact media. If a con-
stant current is now passed through the elec-
trodes, the additional deflection of a volt-
meter connected across them will be propor-
tional to the resistance between them, which
is virtually the resistance at the inactive 
electrode. A more refined method is the three- 
electrode technique illustrated in Figure 4. 
For measuring the resistance at the inactive 
Electrode B, the third Electrode C may be 
attached to the contralateral forearm. A con- 
stant current is applied through Electrodes A 
and B, as shown in Figure 4, and the voltage 
is measured between B and C. This will be
proportional to the resistance at Electrode B.

When resistance measurements are made
by the constant-current method, it is clearly 
preferable to use the three-electrode system 
to exclude the resistance at the inactive site, 
as described earlier, rather than to measure 
it. In this case, preparation at the supply
Electrode B, as shown in Figure 4, need not 
be stringent, while the voltage reference site, 
C, should be treated in the same way as the
active site, A. The above paragraphs regard-
ing preparation of the inactive site are par-
ticularly applicable to conductance measure-
ments, for which a conventional unipolar ar-
rangement must be used.


Electrolytes 


Since the resistance of the skin is affected 
by its water content (Blank & Finesinger, 
1946), the contact medium should be isotonic 
with sweat. This consists largely of Na and 
Cl which, according to various estimates 
(Kuno, 1956), are present in approximately 
equimolecular proportion in a concentration 
equivalent to about .05 M NaCl. Contact 
media for use with Ag-AgCl or double-ele- 
ment electrodes should therefore contain 
NaCl in this concentration. In the case of 
Zn-ZnSO4 and Zn-ZnCl2 electrodes, the same
tonicity would be given by .076 M ZnSO4 and
.036 M ZnCl2.

Gum tragacanth or bentonite clay have 
sometimes been recommended as bases to 
give the required consistency. However, Edel- 
berg and Burch (1962) have pointed out that 
these substances are not biologically inert. 
These authors give a method for preparing a 
medium with an inert base of cornstarch. 
Simple agar (not nutrient agar) or methyl 
cellulose should also be satisfactory. 

It is necessary to use the same electrolyte 
medium at both electrodes in order to pre- 
vent the generation of an EMF. For this rea- 
son, concentrated commercial jellies should 
not be used at the inactive site as a means of 
reducing the resistance. 


EXPERIMENTAL VARIABLES 


The variables relating to measurement 
technique, which have been discussed in the 
preceding sections, form only one of the 
groups of variables which affect the results 
obtained. Two other groups are the environ-
mental and organismic variables. These are
equally important, and they should be con- 
trolled where possible. 

Assessment of the effects of these variables 
has been made difficult by the diversity of 
experimental techniques employed by earlier 
workers. This has limited the comparability
of their results. Moreover, a distressing num-
ber of potentially valuable studies have been
quantitatively invalidated by the use of units 
of measurement which are now seen to be 
inappropriate. This can seldom be righted by 
mathematical transformation of the published 
figures unless all the measurements have 
been given. For example, the mean skin 
conductance of a population cannot be calcu-
lated from the mean of resistance readings.
Likewise, there is no fixed relationship be- 
tween change in resistance and change in 
conductance. When the resistance scale has
been used, the findings in regard to the back- 
ground level can usually be interpreted in 
terms of conductance. In the case of the 
GSR, on the other hand, the use of inap- 
propriate measures may lead to conclusions 
which are entirely erroneous. For these rea- 
sons, the following review is essentially a 
qualified list of the variables which should be 
taken into account in the planning of experi-
ments. The results of earlier studies in regard
to the background level are summarized in 
terms of conductance although they may 
have been presented in resistance units. Re- 
ports on the GSR, which are fewer in number, 
are only included when the conductance or 
log conductance scales were used. 
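
The two arithmetical points made above, that a mean conductance cannot be recovered from a mean resistance and that a given change in resistance corresponds to no fixed change in conductance, can be checked with a brief sketch. The resistance readings below are hypothetical and the helper names are illustrative only.

```python
import math
from statistics import mean


def conductance_micromho(resistance_kohm):
    """Convert a resistance in kilohms to a conductance in micromhos."""
    return 1000.0 / resistance_kohm


# Hypothetical resistance readings (kilohms) from three subjects.
resistances_kohm = [20.0, 50.0, 200.0]

# Correct: convert each reading first, then average.
mean_of_conductances = mean([conductance_micromho(r) for r in resistances_kohm])

# Incorrect shortcut: average the resistances, then convert.
conductance_of_mean_resistance = conductance_micromho(mean(resistances_kohm))


def conductance_change(base_kohm, drop_kohm):
    """Rise in conductance produced by a given fall in resistance."""
    return conductance_micromho(base_kohm - drop_kohm) - conductance_micromho(base_kohm)


# The same 5-kilohm fall in resistance at two different background levels.
change_low_base = conductance_change(20.0, 5.0)
change_high_base = conductance_change(200.0, 5.0)

print(mean_of_conductances, conductance_of_mean_resistance)
print(change_low_base, change_high_base)
```

With these invented figures the two "means" differ by more than a factor of two, and an identical fall in resistance yields conductance changes that differ by two orders of magnitude depending on the background level. Neither discrepancy can be removed unless the individual readings are available.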


Environmental Variables 


Temperature. A number of earlier workers 
(Cattell, 1928; Duffy & Lacey, 1946; Free- 
man & Giffin, 1939; Smith, 1937; Wenger, 
1948) found no relationship between conduc- 
tance level and room temperature, but a rela- 
tionship was subsequently reported by Conk- 
lin (1951) who worked with effective tem- 
peratures in the range 70-86° F (21-30° C). 
Eysenck (1956) obtained a positive correla- 
tion of .35 between log conductance level and 
room temperature in a group of neurotic pa- 
tients, but she found no evidence of any 
relationship in a psychotic group or a healthy 
control group. Previously, Venables (1955) 
had shown that the GSR, expressed as the 
percentage change in conductance, was af- 
fected by room temperatures above 66° F 
(19° C) in neurotic patients, whereas in
normal subjects there was no effect with
temperatures up to 73° F (23° C). Both
these observations suggest that in studies on 
the effect of room temperature, the crucial 
factor may sometimes be the “stress” of 
extreme environmental conditions rather than 
the temperature itself. Neurotics may be ex- 
pected to have a lower stress tolerance than 
normal subjects. 

Recently, Maulsby and Edelberg (1960) 
have demonstrated a linear relationship be- 
tween log resistance (and therefore in oppo- 
site sense log conductance) and skin tem- 
perature over a wide range. It appears that 
these workers recorded from a whole segment 
of a finger, that is, from a moderately large 
area including both volar and dorsal surfaces. 
This suggests the possibility that their results 
might have been attributable to the expansion 
and contraction of the high conductance area 
with a rise and fall in temperature (Richter, 
Woodruff, & Eaton, 1943) rather than to a 
uniform change over the surface. Neverthe- 
less, their interesting finding provides a clear 
indication of the need to control the tem- 
perature variable. 

Humidity. Several workers (Cattell, 1928;
Eysenck, 1956; Freeman & Giffin, 1939; 
Smith, 1937; Wenger, 1948) have found no 
evidence of a relationship between humidity 
and skin conductance. However, Venables 
(1955), using groups of healthy and neurotic 
subjects, obtained evidence in both groups of 
a negative correlation between conductance 
and humidity between 54 and 66% relative 
humidity. Above and below this range, the
results suggested a positive relationship. 

Time of day. Skin conductance is generally
lower at night than during the day (Farmer 
& Chambers, 1925; Freeman & Darrow, 1935; 
Landis & Forbes, 1933; Muller, 1904; Rich- 
ter, 1926), reaching a maximum level around 
midday (Waller, 1919; Weschler, 1925). 
Within this cycle, minor variations may be 
observed at meal times, with a fall preceding, 
and a rise following a meal (Farmer & Cham-
bers, 1925).


Organismic Variables 


Age. Obrist (1948) and Jones (1949) have
reported that skin conductance increases


no relationship between age and background 
level. In this context, it is interesting to note 
that MacKinnon (1954), using a group of 
123 healthy males with ages from 7 to 96 
years, found that the number of active sweat 
glands per unit area of the 
middle finger decreased significantly in suc- 
cessive 20-year periods. However, this was 
not confirmed by Hellon and Lind (1956),
who found no difference between two age
groups either in cool or in hot environments.

Sex. During adolescence, girls tend to have 
a lower conductance level than boys (Jones, 
1949). Kawahata (1960) has stated that, in 
general, males tend to sweat more than fe-
males. Montagu (1963) found a difference
between the sexes in respect of the variabil- 
ity of the GSR when subjects were retested at 
intervals over a period of 28 days. The fe-
male subjects displayed greater intraindivid-
ual variability between tests, which was sig-
nificant at the p < .001 level. It was sug-
gested that this might have been related to
the menstrual cycle. 

Race. Johnson and Corah (1963) have re- 
ported that, in two laboratories using differ- 
ent-aged subjects and different techniques, 
Negro subjects were observed to have a lower 
skin conductance than a comparable white 
population. 

Personality traits. Relationships have been
reported between introversion/extraversion 
and both level of conductance (Cattell, 1928; 
Jones, 1950) and GSR (Jones, 1950), but 
Martin (1960) failed to observe either. The 
conflicting results may be partly attributable 
to the different criteria of introversion/extra- 
version. Martin used the Maudsley Person- 
ality Inventory; Cattell and Jones did not. 

Intelligence. A negative correlation between 
intelligence and the level of conductance 
“reached and sustained after a period of ac- 
commodation lasting 20 minutes” has been 
reported by O’Connor and Venables (1956) 
in imbeciles. As a result, O'Connor and Vena- 
bles predicted a difference in skin conduc- 
tance between imbeciles and normal subjects, 
which they verified experimentally at the p
< .01 level of confidence. It is interesting to
speculate again upon the influence of stress on 
this relationship. People of low intelligence
might be expected to have greater difficulty 
in adjusting to their environment, which 
might therefore be more stressful for them. 

Habituation and adaptation. It has long
been known that the GSR tends to diminish 
in size with repetition of the stimulus. The
diminution may be observed when the stimu- 
lus is repeated at short intervals within the 
session (Farmer & Chambers, 1925; Peterson 
& Jung, 1907) or at longer intervals from day 
to day (Porter, 1938). In both cases, the 
diminution in the response is often associated 
with a drop in the level of the skin conduc- 
tance (Conklin, 1951; Davis, 1934; Duffy & 
Lacey, 1946). The terms habituation and 
adaptation have been variously used to de- 
scribe these phenomena. It is suggested that 
habituation should be reserved for the decre- 
ment in the GSR, and adaptation for the 
drop in conductance level on repetition of the 
stimulus or test situation. 

Recently, Lader (1963, 1964) demon- 
strated that when a stimulus was repeated 
monotonously within the session, there was 
a linear relationship between the size of the 
GSR, expressed as the change in log conduc- 
tance, and the logarithm of the stimulus num- 
ber. Subsequently, Montagu (1963) obtained 
a similar linear relationship between sessions 
when the GSR, expressed in the same units, 
was plotted against the test number on a log 
scale. These relationships, where applicable,
offer a means of correcting for habituation in 
studies in which this cannot be controlled. 
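
One way in which such a relationship might be used for correction is sketched below. The responses are invented values, in units of change in log conductance, that fall roughly on a straight line against the logarithm of the stimulus number. The correction rule, subtracting the fitted trend so that each response is expressed as its expected value at the first stimulus, is an assumption made for illustration and is not a procedure described by Lader or Montagu.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical GSRs (change in log conductance) to eight repetitions.
stimulus_numbers = [1, 2, 3, 4, 5, 6, 7, 8]
gsr = [0.100, 0.082, 0.071, 0.064, 0.058, 0.053, 0.049, 0.046]

# Regress the response on the logarithm of the stimulus number.
log_n = [math.log10(n) for n in stimulus_numbers]
intercept, slope = fit_line(log_n, gsr)

# Assumed correction: remove the fitted habituation trend, so that every
# response is referred to stimulus number 1 (where log10(n) = 0).
corrected = [y - slope * x for x, y in zip(log_n, gsr)]

print(intercept, slope)
print(corrected)
```

A negative slope indicates habituation. After the trend is removed the corrected responses scatter about the intercept, so that responses obtained at different points in the series can be compared on a common footing.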

Mental health. Earlier workers who looked 
for associations between mental illness and 
skin resistance reported findings which were, 
for the most part, utterly conflicting. The 
confusion has been well summarized by Lan- 
dis (1932), who pointed out that it was vir- 
tually possible to balance each report with 
another of opposite result. The position is 
little changed today. The most careful and 
convincing recent study has probably been 
that of Eysenck (1956), who compared a 
group of 123 healthy subjects with groups of 
55 mixed-neurotic and 55 mixed-psychotic 
patients. She found that the psychotic group
had a significantly lower log conductance
than the control group throughout the 

On the other hand, the neurotic subjects
could only be discriminated from the control
group on one score, the initial level of
conductance, which was lower in the patient
group. After 5 minutes of resting recording,
the difference was no longer evident. The
initial lower conductance in the neurotic
group is somewhat surprising in view of the
fact that 73% of the group were dysthymics,
who might be expected to be more anxious
initially than normal subjects. It is suggested
that the result might have been attributable
to test familiarity on the part of the patients.

The findings of Eysenck in regard to the
GSR are of particular interest in view of the
points made in an earlier section about mathe-
matical transformations of the unit of meas-
urement. Eysenck investigated a number of
transformations, and she found that those
which discriminated between the groups
did so by virtue of their correlation with the
background level. No measure was found
which was both independent of background
level and significantly different between the
groups.

Other variables. Among other organismic
variables which have been observed to affect
the skin conductance, the following deserve
mention: physical condition (Boruttau 
Mann, 1909; Gerstner, 1948; Gouge! 
1947); bodily activity (Blank & Finesing 
1946; Freeman & Simpson, 1938; Pinned, 
1961; Sidis & Kalmus, 1908, 1909; Sidis & 
Nelson, 1910; Wenger & Irwin, 1936; t 
1930); and mental work (Conklin, 1951;
Davis, 1934; Freeman & Darrow, 1935).


LARGE-SAMPLE MULTIPLE COMPARISONS 1


LEONARD A. MARASCUILO 
University of California, Berkeley 


Multiple comparisons based upon a $\chi^2$ analog of Scheffé’s Theorem (1959) are
illustrated by means of 5 examples. The examples involve the correlation
coefficients of K independent bivariate normal populations; the parameters of K
independent binomial populations; the interaction measures of K independent
contingency tables; the parameters of K independent normal populations with un-
equal variances; and the differences between the parameters of K sets of paired
normal populations with unequal variances. In addition, a general test statistic is
presented to test the null hypothesis that involves the parameters.


Most behavioral scientists no longer find a 
simple rejection of the null hypothesis suffi- 
cient. Most prefer to follow a decision to reject 
a tested hypothesis with a post hoc analysis 
of certain linear contrasts of the parameters 
to determine the sources of variation that are 
most likely responsible for the rejection of the 
hypothesis. 

If the decision to reject the null hypothesis 
is based upon an F ratio used in a Model I 
analysis-of-variance design, a possible ex- 
planation for the rejection can always be found 
by using any number of multiple confidence- 
interval comparisons methods, the most pop- 
ular being those of Scheffé (1959) or Tukey 
(1953). Similar methods for nonparametric 
analysis-of-variance designs based on ranks 
have been obtained by Dunn (1964) and Steel 
(1960). Goodman (1964a, 1964b) and Gold 
(1963) have derived similar techniques for 
binomial and multinomial models. It is not 
generally known that these methods can be 
easily extended to include contrasts involving 
correlation coefficients, interaction measures of 
contingency tables, and many other standard 
statistical measures. All of these methods 
follow directly from a theorem analogous 
to Scheffé’s Theorem, but based upon the
chi-square distribution instead of the F 
distribution. 

This general theorem is stated in this paper. 
It is then used in a number of examples taken 
from psychological and educational research. 


1 The research reported herein was performed pur- 
suant to a contract with the United States Office of 
Education, Department of Health, Education, and 
Welfare, Contract No. OE-5-10-152. The author would 
like to thank Seong Soo Lee and Joel Levin for their 
assistance in the preparation of the final statistics. 


While the familiar multiple-contrasts methods 
are valid for all sample sizes provided that the 
assumptions are satisfied, the methods to be
presented here are all based upon asymptotic
distribution theory and are therefore large- 
sample methods. As would be expected, these 
methods reduce to the familiar small-sample 
methods whenever the small-sample assump- 
tions are satisfied. While evidence is not 
presented, it is believed that the procedures 
may be used even if the small-sample assump- 
tions are not satisfied; however, Monte Carlo 
sampling investigations of this statement are 
needed. 

To aid the reader, let us recall that a linear
contrast of a statistical model is a function $\psi$
of the following form

$$\psi = a_1\theta_1 + a_2\theta_2 + \cdots + a_K\theta_K$$

where $a_1 + a_2 + \cdots + a_K = 0$ and $\theta_1, \theta_2, \ldots, \theta_K$
are the parameters of the model. Let $\hat\theta_1, \hat\theta_2,
\ldots, \hat\theta_K$ be unbiased, independent, and asymp-
totically normally-distributed estimates of the
unknown parameters under the hypothesis
that they may be different. Then

$$\hat\psi = a_1\hat\theta_1 + a_2\hat\theta_2 + \cdots + a_K\hat\theta_K$$

is an unbiased estimate of the corresponding
contrast. Let $\mathrm{var}(\hat\theta_k)$ be the large-sample
estimate of $\mathrm{Var}(\hat\theta_k)$ and let $\mathrm{var}(\hat\psi)$ be the
large-sample estimate of $\mathrm{Var}(\hat\psi)$. Then the
$\chi^2$ analog of Scheffé’s Theorem may be stated
as follows.

Theorem. In the limit the probability is
$(1 - \alpha)$ that simultaneously for all linear con-
trasts of the form $\psi = a_1\theta_1 + \cdots + a_K\theta_K$

$$\hat\psi - \sqrt{\chi^2_{K-1}(1 - \alpha)}\,\sqrt{\mathrm{var}(\hat\psi)} < \psi < \hat\psi + \sqrt{\chi^2_{K-1}(1 - \alpha)}\,\sqrt{\mathrm{var}(\hat\psi)}.$$

It should be noted that this probability state-
ment is a true statement independent of the
truth of the null hypothesis. However, the 
theorem is most useful for the case in which a 
rejection of the null hypothesis has been made. 
This statement will be clarified. 
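
The interval given by the theorem is simple to compute once the estimates and their large-sample variances are available. The sketch below is illustrative only: the function name, the three estimates, and their variances are hypothetical, and the critical value 5.99 (chi-square with two degrees of freedom, upper 5% point) is read from a standard table rather than computed. Because the estimates are taken to be independent, the variance of the estimated contrast is the sum of the squared coefficients times the variances of the estimates.

```python
import math


def simultaneous_interval(coefs, estimates, variances, chi2_crit):
    """Large-sample simultaneous confidence interval for a linear contrast.

    coefs      -- contrast coefficients a_k, which must sum to zero
    estimates  -- independent estimates of the K parameters
    variances  -- large-sample variance estimates of those estimates
    chi2_crit  -- upper (1 - alpha) point of chi-square with K - 1 df
    """
    if abs(sum(coefs)) > 1e-9:
        raise ValueError("contrast coefficients must sum to zero")
    psi_hat = sum(a * t for a, t in zip(coefs, estimates))
    # Independence of the estimates gives var(psi_hat) = sum a_k^2 var_k.
    var_psi = sum(a * a * v for a, v in zip(coefs, variances))
    half_width = math.sqrt(chi2_crit) * math.sqrt(var_psi)
    return psi_hat - half_width, psi_hat + half_width


# Hypothetical data for K = 3 populations.
estimates = [0.50, 0.20, 0.10]
variances = [0.01, 0.01, 0.01]
CHI2_CRIT_DF2 = 5.99   # tabled value, 2 df, alpha = .05

lower, upper = simultaneous_interval([1, -1, 0], estimates, variances, CHI2_CRIT_DF2)
print(lower, upper)
```

For these invented figures the interval for the difference between the first two parameters includes zero, so that this particular contrast would not be declared significant.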

Let $\hat\theta_0$ be an estimate of the unknown
common value of the parameters under the
hypothesis that they are identical. Let $\hat W_k
= 1/\mathrm{var}(\hat\theta_k)$ be estimates of $W_k = 1/\mathrm{Var}(\hat\theta_k)$.
Then an easy-to-compute statistic that may
be used to test the hypothesis of equal param-
eter values is given by

$$U'_0 = \sum_{k=1}^{K} \frac{(\hat\theta_k - \hat\theta_0)^2}{\mathrm{var}(\hat\theta_k)} = \sum_{k=1}^{K} \hat W_k (\hat\theta_k - \hat\theta_0)^2.$$


Following the examples illustrating the use of
the $\chi^2$ analog of the Scheffé Theorem, a
heuristic argument is presented showing that
$U'_0$ has a $\chi^2$ distribution with $(K - 1)$ de-
grees of freedom. Furthermore, the validity
of $U'_0$ as a test statistic for the null hypothesis
is presented.

As was suggested by one of the referees, the
$\chi^2$ analog of the Scheffé Theorem could be
stated in terms of the statistic as follows: If
$U'_0 > \chi^2_{K-1}(1 - \alpha)$, then either the null hy-
pothesis is true and an event of probability $\alpha$
has occurred, or there is at least one linear
contrast of the parameters which is different
from zero. Thus if $U'_0 < \chi^2_{K-1}(1 - \alpha)$ and
the decision is made not to reject the hy-
pothesis, then all confidence intervals com-
puted on a post hoc basis will be certain to
include the value of zero. On the other hand,
if $U'_0 > \chi^2_{K-1}(1 - \alpha)$ and the decision is made
to reject the hypothesis, then there exists at
least one confidence interval that will not
include zero.
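
The test statistic and the decision rule just described may be sketched as follows. The estimates, the variances, and the function names are hypothetical. The pooled estimate is taken here to be the weighted mean of the individual estimates with weights equal to the reciprocals of their variances, which is the form used in Example One below; treating it as the general choice is an assumption of this sketch. The critical value 5.99 is the tabled upper 5% point of chi-square with two degrees of freedom.

```python
import math


def pooled_estimate(estimates, variances):
    """Weighted mean of the estimates, weights W_k = 1 / var_k (assumed form)."""
    weights = [1.0 / v for v in variances]
    return sum(w * t for w, t in zip(weights, estimates)) / sum(weights)


def u0_statistic(estimates, variances):
    """Sum over k of (estimate_k - pooled)^2 / var_k."""
    theta0 = pooled_estimate(estimates, variances)
    return sum((t - theta0) ** 2 / v for t, v in zip(estimates, variances))


# Hypothetical data for K = 3 populations.
estimates = [0.50, 0.20, 0.10]
variances = [0.01, 0.01, 0.01]
CHI2_CRIT_DF2 = 5.99   # tabled value, K - 1 = 2 df, alpha = .05

theta0 = pooled_estimate(estimates, variances)
u0 = u0_statistic(estimates, variances)
reject = u0 > CHI2_CRIT_DF2

print(theta0, u0, reject)
```

With these figures the statistic exceeds the critical value, so the hypothesis of equal parameters would be rejected and, by the argument above, at least one contrast must have a confidence interval that excludes zero. That contrast need not be one of the simple pairwise differences.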


EXAMPLE ONE 


Multiple Comparisons among the Correlation
Coefficients of K Independent Bivariate Normal
Populations


This example with K = 5 involves the
measurement of the correlations between two
standardized tests given to fifth-grade children
in five urban elementary schools in the Oak-
land, California, Unified School District. The
two tests are the Kuhlmann-Anderson In-
telligence Test and the Paragraph Meaning
section of the Stanford Achievement Test.
The sample sizes and the correlation coeffi- 
cients for the five schools are shown in Table 1. 
As can be seen, the range in correlation coeffi- 
cients is sizeable. Since these particular schools 
were located in neighborhoods spreading over 
broad socioeconomic strata, one would not 
expect the correlations to be necessarily uni- 
form. Using the techniques presented here, 
a test of this assumption is made. In par- 
ticular, the hypothesis 
$$H_0: \rho_1 = \rho_2 = \rho_3 = \rho_4 = \rho_5 = \rho_0$$

is tested against the alternative hypothesis
that $H_0$ is false.


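Normal-curve theory is introduced through
the familiar Fisher transformation. For this let

$$\hat\theta_k = Z_k = \tfrac{1}{2} \log_e \frac{1 + r_k}{1 - r_k}.$$

For large $N_k$ it is known that

$$\theta_k = E(\hat\theta_k) = E(Z_k) = \tfrac{1}{2} \log_e \frac{1 + \rho_k}{1 - \rho_k}$$

$$\mathrm{Var}(\hat\theta_k) = \mathrm{Var}(Z_k) = \frac{1}{N_k - 3}.$$

With this transformation,

$$U'_0 = \sum_{k=1}^{K} \frac{(Z_k - Z_0)^2}{\mathrm{var}(Z_k)} = \sum_{k=1}^{K} W_k (Z_k - Z_0)^2 = \sum_{k=1}^{K} (N_k - 3)(Z_k - Z_0)^2$$

with

$$Z_0 = \sum_{k=1}^{K} W_k Z_k \Big/ \sum_{k=1}^{K} W_k.$$

Note that the $W_k$ do not have to be estimated
for this example. Computations for the test are
summarized in Table 1.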
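
As a check on the transformation, the values for School 1 (r = .66, N = 58) can be reproduced in a few lines. The function names are illustrative. The Fisher transformation is the inverse hyperbolic tangent, so the library function `math.atanh` gives the same result as the logarithmic form written out above.

```python
import math


def fisher_z(r):
    """Fisher transformation: Z = (1/2) log_e[(1 + r) / (1 - r)]."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def fisher_z_variance(n):
    """Large-sample variance of Z for a sample of size n: 1 / (n - 3)."""
    return 1.0 / (n - 3)


# School 1 of Example One.
z1 = fisher_z(0.66)
var_z1 = fisher_z_variance(58)
weight_1 = 1.0 / var_z1          # W_1 = N_1 - 3

print(round(z1, 3), round(var_z1, 4), weight_1)
```

Since the variance depends only on the sample size, the weights are known exactly, which is the sense in which they do not have to be estimated in this example.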


TABLE 1
COMPUTATIONS FOR EXAMPLE ONE

School (k)      1        2        3        4        5

r_k            .66      .70      .68      .92      .44
N_k            58       68       113      37       91
Z_k            .793     .867     .829     1.589    .472
var(Z_k)       .0182    .0154    .0091    .0294    .0114

For this example

$$Z_0 = \frac{55(.793) + 65(.867) + 110(.829) + 34(1.589) + 88(.472)}{55 + 65 + 110 + 34 + 88} = .814$$

and

$$U'_0 = 55(.793 - .814)^2 + 65(.867 - .814)^2 + 110(.829 - .814)^2 + 34(1.589 - .814)^2 + 88(.472 - .814)^2 = 30.94.$$

Since five correlation coefficients are being
compared, $U'_0$ is asymptotically chi-square
with four degrees of freedom if $H_0$ is true.
For $\alpha = .05$, chi-square equals 9.49, and, as
a result, $H_0$ is rejected. Therefore, it is known
that at least two of the correlations are
different or that some contrast of them is
significantly different from zero. For this ex-
ample, the most interesting contrasts are the
simple differences between the correlations.
The general form of the $100(1 - \alpha)$% simultane-
ous confidence intervals about the contrasts
is given by

$$(Z_k - Z_{k'}) \pm \sqrt{\chi^2_{K-1}(1 - \alpha)}\,\sqrt{\frac{1}{N_k - 3} + \frac{1}{N_{k'} - 3}}.$$

For this example,

$$\sqrt{\chi^2_{K-1}(1 - \alpha)} = \sqrt{\chi^2_4(.95)} = \sqrt{9.49} = 3.08.$$

Of the 10 simple contrasts, the 4 that involve
School 4 are statistically significant. The simple
contrasts are as follows:

 $-.64 < Z_1 - Z_2 < .49$      Not Significant
 $-.54 < Z_1 - Z_3 < .47$      Not Significant
$-1.47 < Z_1 - Z_4 < -.12$     Significant
 $-.21 < Z_1 - Z_5 < .85$      Not Significant
 $-.44 < Z_2 - Z_3 < .52$      Not Significant
$-1.37 < Z_2 - Z_4 < -.07$     Significant
 $-.11 < Z_2 - Z_5 < .90$      Not Significant
$-1.36 < Z_3 - Z_4 < -.16$     Significant
 $-.08 < Z_3 - Z_5 < .80$      Not Significant
  $.50 < Z_4 - Z_5 < 1.74$     Significant

Post hoc inspection of the correlation coeffi-
cients suggests that the correlations for Schools
1, 2, and 3 are equal and as a group might
differ from the correlation for School 5. This
post hoc hypothesis can be tested as follows.
A contrast associated with this hypothesis is


= }(Z1+ Z:+ Z) — Zs 
=4(.793 + .867 + .829) — 472 = 3538 


An estimate of Var(ĝ) is given by 


var() = $ var(Z;) + $ var(Z:) 
+ } var(Z3) + var(Zs) 
= $(.0182) + $(.0154) 
+ $(.0091) + .0114 = 0161. 


The 95% confidence interval for y is 


Ê — Vx2(.95) Vvar() < y < ¢ 
4 Vx2(.95) Vvar(ĝ) 
358 — 3.08V.0161 < y < .358 + 3.08V.0161 
— 032 < Ņ < 748 


The result is not significant. The post hoc
hypothesis concerning these four schools is not
supported.

Note that 11 individual confidence intervals
have been investigated with an overall probability
of a Type I error equal to .05. If each
confidence interval had been determined with
Z = 1.96, the familiar normal-curve value, the
probability of at least one Type I error among
the 11 intervals would be less than or equal
to .55, a rather high probability of error. When
the chi-square coefficient of 3.08 is used, the
intervals will be wide, but more important, the
corresponding probability of at least one Type
I error is less than or equal to .05.
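
The computations of Example One can be retraced with a short script. The following is only a sketch under stated assumptions: it takes the Z values and the weights W_k = N_k - 3 exactly as they are quoted in the worked equations, uses the tabled value 9.49 for chi-square with four degrees of freedom, and the function names are hypothetical. Because the inputs are rounded to three decimals, the results agree with the reported .814 and 30.94 only to within rounding.

```python
import math

# Fisher Z values and weights W_k = N_k - 3, as quoted in the worked example.
Z = [0.793, 0.867, 0.829, 1.589, 0.472]
W = [55, 65, 110, 34, 88]
CHI2_4_95 = 9.49  # tabled chi-square, 4 df, alpha = .05 (value taken from the text)


def pooled_estimate(theta, weights):
    """Weighted mean: sum(W_k * theta_k) / sum(W_k)."""
    return sum(w * t for w, t in zip(weights, theta)) / sum(weights)


def homogeneity_statistic(theta, weights):
    """U'_0 = sum(W_k * (theta_k - theta_0)^2)."""
    t0 = pooled_estimate(theta, weights)
    return sum(w * (t - t0) ** 2 for w, t in zip(weights, theta))


def pairwise_intervals(theta, weights, chi2_crit):
    """Simultaneous intervals for every simple difference theta_i - theta_j."""
    coef = math.sqrt(chi2_crit)
    out = {}
    for i in range(len(theta)):
        for j in range(i + 1, len(theta)):
            half = coef * math.sqrt(1 / weights[i] + 1 / weights[j])
            diff = theta[i] - theta[j]
            out[(i + 1, j + 1)] = (diff - half, diff + half)
    return out


z0 = pooled_estimate(Z, W)
u0 = homogeneity_statistic(Z, W)
intervals = pairwise_intervals(Z, W, CHI2_4_95)
# A contrast is significant when its interval excludes zero.
significant = sorted(p for p, (lo, hi) in intervals.items() if lo > 0 or hi < 0)
print(round(z0, 3), round(u0, 2), significant)
```

The same three helper functions apply unchanged to any set of independent estimates once their weights are known, which is the point of the general method.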

It should be mentioned that the statistic $U'_0$
does not give a new test for the hypothesis
of equal correlation coefficients. This test statistic
is well known and may be found in
many advanced books in statistics. For example,
see Hays (1963, p. 532), or Rao (1952,
p. 233). What is new, however, is the multiple-
contrast method for detecting the sources of
the differences between the correlations.

One should note that the Fisher Z transformation
is not essential for the proper use
of this method. With large samples, one can
use the sample correlation coefficients because
their asymptotic distributions would be normal.
However, if the sample sizes are small,
then the Fisher Z transformation must be
used to introduce normality.


EXAMPLE TWO


Multiple Comparisons among the Parameters of 
K Independent Binomial Populations 


This example with K = 3 is from a study of 
responses to a question appearing on a mailed 
questionnaire sent to a random sample of 
adults in Berkeley, California. The purpose of 
the survey was to measure the attitudes of 
the public concerning some proposed changes 
designed to improve the racial balance within 
the public schools of the community. The 
question was as follows: 

Now Berkeley Schools divide children into classes on
the basis of how well they did in earlier classes. This
sometimes leads to racial segregation in the classrooms.
[It has been] suggested [that] fewer “ability groupings”
be used. This would cut down on segregation and
would produce classes with children who have a larger
range of ability.


I agree___ I disagree___ I am not sure___


A stratified sampling procedure using census 
tracts as the strata was employed. On the 
basis of 1960 census-tract data, the 28 census 
tracts of the community were divided into 
three education groups. The 9 census tracts 
with the lowest reported median years of 
education were classified as low-education 
tracts; the next 10 were classified as medium- 
education tracts; and the 9 tracts with the 
highest median years of education were 
classified as high-education tracts. Responses
to the question classified on the education
variable are summarized in Table 2. The
hypothesis tested is that the proportion of
adults who agree with the question is unrelated
to the education level of the tracts in
the community. The statistical hypothesis is


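$$H_0\colon p_1 = p_2 = p_3 = p_0$$

with the alternative hypothesis being that $H_0$
is false.

Normal-curve theory is introduced through
the set of maximum likelihood estimators of
the parameters of the K independent binomial
populations. For this case

$$\hat\theta_k = \hat p_k = \frac{\text{number who agree in Level } k}{\text{number of respondents in Level } k}.$$

As shown in most elementary statistics textbooks

$$\theta_k = E(\hat\theta_k) = E(\hat p_k) = p_k$$

$$\mathrm{Var}(\hat\theta_k) = \mathrm{Var}(\hat p_k) = \frac{p_k q_k}{N_k}, \qquad q_k = 1 - p_k.$$

With this transformation and the estimated
variances

$$U'_0 = \sum_{k=1}^{3}\frac{(\hat p_k - \hat p_0)^2}{\mathrm{var}(\hat p_k)} = \sum_{k=1}^{3} W_k(\hat p_k - \hat p_0)^2$$

with

$$\hat p_0 = \sum_{k=1}^{3} W_k \hat p_k \Big/ \sum_{k=1}^{3} W_k .$$

Computations for the test are summarized in
Table 2.

TABLE 2
COMPUTATIONS FOR EXAMPLE TWO

For this example

$$\hat p_0 = \frac{322.5806(.3816) + 1136.3636(.2807) + 1075.2688(.1964)}{322.5806 + 1136.3636 + 1075.2688} = .2578$$

and

$$U'_0 = 322.5806(.3816 - .2578)^2 + 1136.3636(.2807 - .2578)^2 + 1075.2688(.1964 - .2578)^2 = 9.58.$$
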

Since three binomial populations are being
compared, $U'_0$ is asymptotically chi-square
with two degrees of freedom if $H_0$ is true. For
α = .05, chi-square equals 5.99. As a result,
$H_0$ is rejected. Therefore, it is known that at
least two of the parameters are statistically
different from one another or that at least
one linear contrast of the parameters is significantly
different from zero. For this example,
the most interesting contrasts are the simple
differences between the binomial parameters.
The general form of the (1 - α)% set of
simultaneous confidence intervals about the
contrasts is given by

$$(\hat p_{k_1} - \hat p_{k_2}) \pm \sqrt{\chi^2_{K-1}(1 - \alpha)}\,\sqrt{\frac{\hat p_{k_1}\hat q_{k_1}}{N_{k_1}} + \frac{\hat p_{k_2}\hat q_{k_2}}{N_{k_2}}}$$

where $\hat q_k = 1 - \hat p_k$. For this example

$$\sqrt{\chi^2_{K-1}(1 - \alpha)} = \sqrt{\chi^2_2(.95)} = \sqrt{5.99} = 2.45.$$


Again these intervals will be wider than those 
obtained for Z = 1.96. However, the proba- 
bility of at least one Type I error among the 
set is less than .05 instead of .15 as it would be 
if Z = 1.96 were to be used. Of the three 
simple contrasts, one contrast is significant. 
It indicates that a difference in agreement 
exists between the census tracts that are either 
high or low in education. All simple contrasts 
are listed: 


 -.05 < p_1 - p_2 < .26   Not Significant
  .03 < p_1 - p_3 < .34   Significant
 -.02 < p_2 - p_3 < .19   Not Significant
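
As a rough check on Example Two, the sketch below recomputes the pooled proportion, the test statistic, and the three simultaneous intervals from the proportions and weights quoted in the text. It assumes nothing beyond those printed values; the variable names are hypothetical. For two degrees of freedom the chi-square quantile has the closed form -2 log α, which is used here in place of a table lookup. With the rounded inputs the statistic comes out near 9.59, close to the 9.58 reported.

```python
import math

# Proportions agreeing (low, medium, high education tracts) and
# weights W_k = 1 / var(p_hat_k), both as quoted in the text.
p_hat = [0.3816, 0.2807, 0.1964]
W = [322.5806, 1136.3636, 1075.2688]
alpha = 0.05

# For 2 df the chi-square CDF is 1 - exp(-x / 2), so the quantile is -2 ln(alpha).
chi2_crit = -2.0 * math.log(alpha)

p0 = sum(w * p for w, p in zip(W, p_hat)) / sum(W)
u0 = sum(w * (p - p0) ** 2 for w, p in zip(W, p_hat))

coef = math.sqrt(chi2_crit)
intervals = {}
for i in range(3):
    for j in range(i + 1, 3):
        # var(p_i - p_j) = var(p_i) + var(p_j) = 1/W_i + 1/W_j
        half = coef * math.sqrt(1 / W[i] + 1 / W[j])
        diff = p_hat[i] - p_hat[j]
        intervals[(i + 1, j + 1)] = (diff - half, diff + half)

significant = [pair for pair, (lo, hi) in intervals.items() if lo > 0 or hi < 0]
print(round(p0, 4), round(u0, 2), round(chi2_crit, 2), significant)
```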


One might ask why $U'_0$ should be proposed
as a test statistic for testing the null hypothesis
when the familiar Pearson formula

$$\chi^2 = \sum \frac{(O - E)^2}{E},$$

with O and E the observed and expected cell
frequencies, is available and leads to the same
conclusions: χ² = 9.59, $U'_0$ = 9.58. To make
matters even more confusing, Goodman (1964a,
1964b) has introduced another test statistic which he
calls Y² for testing the same hypothesis. He
has shown that Y² is asymptotically equivalent
to Pearson’s chi-square. The same is true for
$U'_0$. For this example, Y² = 9.58 so that the
use of this statistic leads to the same decision.
The major reason for using Y² or $U'_0$ is that
they give rise to the multiple-contrast method
which is of considerable interest. In addition,
there is a slight advantage in using $U'_0$ over
Y² in that it is somewhat easier to calculate.
Research assistants who have not had extensive
mathematical training will find the
computation procedure for $U'_0$ somewhat
easier than the corresponding procedure for
Y². However, it probably makes little difference
which of the three statistics is used,
provided that the sample sizes are large.


EXAMPLE THREE 


Multiple Comparisons among the Interaction 
Measures among K Independent Contingency 
Tables (After Goodman) 


This example with K = 3 is from the same 
study as Example 2. In this example, responses 
to the question 

For some grade schools, [it has been] suggested that 
lines be changed so that the percentage of non-white 
and white children in these schools would be more like 
the percentage for the entire school system. 


I agree_____ I disagree _______ I am not sure. 


were classified on a post hoc basis according to 
reported race and age as given by the re- 
spondents. The frequencies are summarized 
in Table 3. Clearly, the marginal totals (not
shown) for each 2 X 2 table are random variables.
The number of whites and nonwhites was
unknown at the time of the mailing, and, cer- 
tainly, the number agreeing to the question 
was also an unknown quantity. Thus, the 
simple binomial theory is not applicable. 
Consider the 2 X 2 contingency table for
the lowest age group. A standard measure of
interaction or association is given by

$$\Delta_1 = \frac{p_{11}\,p_{22}}{p_{12}\,p_{21}}$$

with $(p_{11} + p_{12} + p_{21} + p_{22}) = 1$, where $p_{11}$
and $p_{22}$ are the diagonal probabilities of
occurrence, and $p_{12}$ and $p_{21}$ are the off-diagonal
probabilities of occurrence. Another equally
valid measure of association is

$$\gamma_1 = \log_e \Delta_1 = \log_e p_{11} + \log_e p_{22} - \log_e p_{12} - \log_e p_{21}.$$

Its similarity to the interaction measures in
a 2 X 2 analysis-of-variance design is obvious.
The statistical hypothesis of interest for this
example is

$$H_0\colon \gamma_1 = \gamma_2 = \gamma_3 = \gamma_0$$

with the alternative hypothesis being that $H_0$
is false.

Normal-curve theory is achieved through
the maximum likelihood estimates of the $\Delta_k$
and $\gamma_k$. It is easy to show that the maximum
likelihood estimate of $\Delta_k$ is

$$\hat\Delta_k = \frac{\hat p_{k11}\,\hat p_{k22}}{\hat p_{k12}\,\hat p_{k21}} = \frac{n_{k11}\,n_{k22}}{n_{k12}\,n_{k21}}$$

and therefore the maximum likelihood estimate
of $\gamma_k$ is

$$\hat\gamma_k = \log_e \hat\Delta_k = \log_e n_{k11} + \log_e n_{k22} - \log_e n_{k12} - \log_e n_{k21}.$$

One can also show that the variances of these
estimates are given by

$$\mathrm{Var}(\hat\Delta_k) = \hat\Delta_k^2\left[\frac{1}{n_{k11}} + \frac{1}{n_{k12}} + \frac{1}{n_{k21}} + \frac{1}{n_{k22}}\right]$$

and

$$\mathrm{Var}(\hat\gamma_k) = \frac{1}{n_{k11}} + \frac{1}{n_{k22}} + \frac{1}{n_{k12}} + \frac{1}{n_{k21}}.$$

With $\hat\theta_k = \hat\gamma_k$ and the estimated variances

$$U'_0 = \sum_{k=1}^{3}\frac{(\hat\gamma_k - \hat\gamma_0)^2}{\mathrm{var}(\hat\gamma_k)} = \sum_{k=1}^{3} W_k(\hat\gamma_k - \hat\gamma_0)^2$$

with

$$\hat\gamma_0 = \sum_{k=1}^{3} W_k\hat\gamma_k \Big/ \sum_{k=1}^{3} W_k .$$

For this example, it should be mentioned that
$U'_0$ and Goodman’s Y² are identical test
statistics. Computations for the test are
summarized in Table 3.

TABLE 3
COMPUTATIONS FOR EXAMPLE THREE

Age level                 Low          Medium          High
Race                    W     NW      W     NW       W     NW
Number agreeing        134    35     102    75       59    40
Number disagreeing      35    12     108     9      139     8
N_k                    169    47     210    84      198    48
γ̂_k                     .27205       -2.17742       -2.46637
var(γ̂_k)                .14793        .14350         .17414
W_k                     6.7599        6.9686         5.7425

Note.—Abbreviated: W = white, NW = nonwhite.

For this example

$$\hat\gamma_0 = \frac{6.7599(.27205) + 6.9686(-2.17742) + 5.7425(-2.46637)}{6.7599 + 6.9686 + 5.7425} = -1.41224$$

and

$$U'_0 = 6.7599(.27205 + 1.41224)^2 + 6.9686(-2.17742 + 1.41224)^2 + 5.7425(-2.46637 + 1.41224)^2 = 29.64.$$


Since three contingency tables are being
compared, $U'_0$ is asymptotically chi-square
with two degrees of freedom if $H_0$ is true. For
α = .05, chi-square equals 5.99, and, as a
result, $H_0$ is rejected. The general form of the
(1 - α)% set of simultaneous confidence intervals
is given by

$$(\hat\gamma_{k_1} - \hat\gamma_{k_2}) \pm \sqrt{\chi^2_{K-1}(1 - \alpha)}\,\sqrt{\mathrm{var}(\hat\gamma_{k_1}) + \mathrm{var}(\hat\gamma_{k_2})};$$

for this example,

$$\sqrt{\chi^2_{K-1}(1 - \alpha)} = \sqrt{\chi^2_2(.95)} = \sqrt{5.99} = 2.45.$$

Of the three listed contrasts, two are significant.

  1.1283 < γ_1 - γ_2 < 3.7707   Significant
  1.3495 < γ_1 - γ_3 < 4.1273   Significant
 -1.0916 < γ_2 - γ_3 < 1.6695   Not Significant


These results indicate that the interaction 
measure for the youngest age group differs 
from that for the two older age groups. Within 
the youngest age group, agreement with the 
question is the same for both races, but with 
the higher ages, the nonwhites support the 
question more than the whites. 
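
The log cross-product ratio and its variance are simple enough to verify directly. The sketch below is illustrative only. The cell counts it uses are an assumption: they are back-calculated so that each 2 X 2 table reproduces the column totals, the estimates (.27205, -2.17742, -2.46637), and the variances printed for Example Three, and they should not be read as an independent source. The function name is hypothetical.

```python
import math

# ASSUMED cell counts per age level, in the order
# (white agree, nonwhite agree, white disagree, nonwhite disagree).
# They are back-calculated to match the printed totals, estimates, and variances.
tables = {
    "low": (134, 35, 35, 12),
    "medium": (102, 75, 108, 9),
    "high": (59, 40, 139, 8),
}


def log_cross_product_ratio(n11, n12, n21, n22):
    """Return (gamma_hat, var_hat) for one 2 x 2 table."""
    gamma = math.log(n11) + math.log(n22) - math.log(n12) - math.log(n21)
    var = 1 / n11 + 1 / n12 + 1 / n21 + 1 / n22
    return gamma, var


estimates = {age: log_cross_product_ratio(*cells) for age, cells in tables.items()}
weights = {age: 1 / var for age, (_, var) in estimates.items()}

gamma0 = sum(weights[a] * estimates[a][0] for a in tables) / sum(weights.values())
u0 = sum(weights[a] * (estimates[a][0] - gamma0) ** 2 for a in tables)

coef = math.sqrt(-2.0 * math.log(0.05))  # sqrt of chi-square(2 df) at .95
ages = list(tables)
significant = []
for i in range(len(ages)):
    for j in range(i + 1, len(ages)):
        a, b = ages[i], ages[j]
        diff = estimates[a][0] - estimates[b][0]
        half = coef * math.sqrt(estimates[a][1] + estimates[b][1])
        if diff - half > 0 or diff + half < 0:
            significant.append((a, b))

print(round(gamma0, 4), round(u0, 2), significant)
```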


EXAMPLE FOUR 


Multiple Comparisons among the Parameters of 
K Independent Normal Populations—Equality 
of Variances Not Assumed 


This example with K = 3 is from the doc- 
toral dissertation of William Rohwer (1964). 
For this study, 96 sixth-grade subjects were 
assigned to three experimental conditions. 
Subjects were expected to learn a list of eight 
pairs of high-frequency nouns. The two nouns
in each pair were connected either by a conjunction,
a verb, or a preposition. One measure
of learning was the number of correct responses
in six trials. The results are summarized in
Table 4. The hypothesis to be tested is the
usual one-way analysis-of-variance hypothesis
$$H_0\colon \mu_1 = \mu_2 = \mu_3 = \mu_0$$
against the alternative hypothesis that $H_0$ is
false.

To test the hypothesis, let $\hat\theta_k = \bar X_k$. For any
value of $N_k$, it is shown in most elementary
statistics textbooks that

$$\theta_k = E(\hat\theta_k) = E(\bar X_k) = \mu_k$$

$$\mathrm{Var}(\hat\theta_k) = \mathrm{Var}(\bar X_k) = \sigma_k^2 / N_k .$$

With this transformation and the estimated
variances

$$U'_0 = \sum_{k=1}^{3} W_k(\bar X_k - \bar X_0)^2$$

where

$$\bar X_0 = \sum_{k=1}^{3} W_k \bar X_k \Big/ \sum_{k=1}^{3} W_k .$$

Computations for the test are summarized in
Table 4.

TABLE 4
COMPUTATIONS FOR EXAMPLE FOUR

Noun connection    Conjunction
X̄_k                   20.22
N_k                      32
S_k²                 109.56
var(X̄_k)               3.42
W_k                   .2921

For this example

$$\bar X_0 = \frac{.2921(20.22) + .2902(26.84) + .2112(28.46)}{.2921 + .2902 + .2112} = 24.83$$

and

$$U'_0 = .2921(20.22 - 24.83)^2 + .2902(26.84 - 24.83)^2 + .2112(28.46 - 24.83)^2 = 10.16.$$

Since three experimental conditions are
being compared, $U'_0$ is asymptotically chi-square
with two degrees of freedom if $H_0$ is
true. For α = .05, chi-square equals 5.99 and
as a result, $H_0$ is rejected. The general form
of the (1 - α)% set of simultaneous confidence
intervals about the contrasts is given by

$$(\bar X_{k_1} - \bar X_{k_2}) \pm \sqrt{\chi^2_{K-1}(1 - \alpha)}\,\sqrt{\frac{S_{k_1}^2}{N_{k_1}} + \frac{S_{k_2}^2}{N_{k_2}}}.$$

The contrasts are listed:

 -13.02 < μ_1 - μ_2 <  -.21   Significant
 -15.25 < μ_1 - μ_3 < -1.25   Significant
  -8.62 < μ_2 - μ_3 <  5.38   Not Significant



Since the Ns are equal and the variances are
essentially the same, the F test would normally
be used. For this example, F = 4.93. The
result is significant with α = .05 since $F_{2,93}(.95)$
= 3.11. According to the Scheffé method of
multiple contrasts, one simple difference is
significant. The contrasts based on the Scheffé
method are listed:

 -13.56 < μ_1 - μ_2 <   .32   Not Significant
 -15.18 < μ_1 - μ_3 < -1.30   Significant
  -8.56 < μ_2 - μ_3 <  5.32   Not Significant


Comparing these intervals with those based
on the chi-square distribution, it is seen that
the two sets are essentially the same. However,
while the difference between $\mu_1$ and $\mu_2$
is not significant with the Scheffé method,
this difference is significant with the large-sample
chi-square intervals. The reason for
this is that the sample variances $S_1^2$ and $S_2^2$
are both less than the Mean Square error
(123.79) which is used in the Scheffé intervals.

Note that for this example, the large-sample
chi-square coefficient is equal to $\sqrt{\chi^2_2(.95)}$
= 2.45 while the corresponding coefficient for
the Scheffé intervals is given by

$$\sqrt{2F_{2,93}(.95)} = \sqrt{2(3.11)} = 2.49.$$

This near equality in coefficients is not surprising since

$$(K - 1)F_{K-1,\infty} = \chi^2_{K-1}.$$


It should be mentioned that $U'_0$ is amenable
to small samples with unequal variances and
unequal sample sizes. For this, $U'_0$ is multiplied
by a constant and a new set of degrees
of freedom must be calculated. Since the
sample sizes are large and the variances are
nearly equal, such a correction is not needed.
For this example, the correction would reduce
$U'_0$ to 10.05 with no change in the final
decision. This correction procedure is described 
in considerable detail by C. C. Li (1964, pp. 
436-438). Interested readers may refer to it. 
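
The contrast between the two kinds of interval in Example Four can be illustrated numerically. The sketch below is a hedged reconstruction, not the author's procedure: it takes the means, the weights W_k, the common N of 32, the Mean Square error of 123.79, and the tabled F value of 3.11 from the text, and the two function names are hypothetical. It shows why the first contrast is significant under the chi-square intervals (which use the two groups' own variances) but not under the Scheffé intervals (which use the pooled error term).

```python
import math

means = [20.22, 26.84, 28.46]      # group means quoted in the text
W = [0.2921, 0.2902, 0.2112]       # W_k = 1 / var(mean_k), quoted in the text
N = 32                             # subjects per condition (96 / 3)
MSE = 123.79                       # Mean Square error quoted in the text
F_CRIT = 3.11                      # F(2, 93) at .95, tabled value from the text
CHI2_CRIT = -2.0 * math.log(0.05)  # chi-square(2 df) at .95, closed form


def chi_square_interval(i, j):
    """Large-sample interval: separate variance estimate for each group."""
    half = math.sqrt(CHI2_CRIT) * math.sqrt(1 / W[i] + 1 / W[j])
    diff = means[i] - means[j]
    return diff - half, diff + half


def scheffe_interval(i, j, k=3):
    """Scheffe interval: pooled Mean Square error, equal group sizes."""
    half = math.sqrt((k - 1) * F_CRIT) * math.sqrt(MSE * (2 / N))
    diff = means[i] - means[j]
    return diff - half, diff + half


chi_12 = chi_square_interval(0, 1)
sch_12 = scheffe_interval(0, 1)
print([round(x, 2) for x in chi_12], [round(x, 2) for x in sch_12])
```

The two multipliers differ only slightly (about 2.45 against 2.49); most of the difference in width for this contrast comes from the variance term.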


EXAMPLE FIVE 


Multiple Comparisons among the Differences 
between K Sets of Paired Normal Populations 
—Analysis of Variance 


Interaction test for a 2 X K design, equality
of variances not assumed. This example with
K = 3 is from the same study as Example 4.
In addition to varying the parts of speech
between the paired associates, the learning
materials were presented in normal English
syntax ordering for one-half of the subjects;
while with the remaining subjects, the normal
English syntax ordering was scrambled. The
results are summarized in Table 5. The hypothesis
to be tested is the usual analysis-of-variance
hypothesis that the two-factor interactions
are all zero. The statistical hypothesis is

$$H_0\colon (\mu_{11} - \mu_{12}) = (\mu_{21} - \mu_{22}) = (\mu_{31} - \mu_{32}) = (\mu_{01} - \mu_{02})$$
with the alternative hypothesis being that $H_0$
is false. To test this hypothesis, let
$\hat\theta_k = \bar X_{k1} - \bar X_{k2} = d_k$. As shown in most elementary
statistics textbooks,

$$\theta_k = E(d_k) = E(\bar X_{k1} - \bar X_{k2}) = \mu_{k1} - \mu_{k2}$$

$$\mathrm{Var}(\hat\theta_k) = \mathrm{Var}(\bar X_{k1} - \bar X_{k2}) = \frac{\sigma_{k1}^2}{N_{k1}} + \frac{\sigma_{k2}^2}{N_{k2}}.$$

With this transformation and the estimated
variances

$$U'_0 = \sum_{k=1}^{3}\frac{(d_k - d_0)^2}{\mathrm{var}(d_k)} = \sum_{k=1}^{3} W_k(d_k - d_0)^2$$

with

$$d_0 = \sum_{k=1}^{3} W_k d_k \Big/ \sum_{k=1}^{3} W_k .$$

Computations for this test are summarized in
Table 5.
For this example

$$d_0 = \frac{.0763(4.32) + .0930(9.81) + .0891(15.69)}{.0763 + .0930 + .0891} = 10.22$$

and

$$U'_0 = .0763(4.32 - 10.22)^2 + .0930(9.81 - 10.22)^2 + .0891(15.69 - 10.22)^2 = 5.34.$$


TABLE 5
COMPUTATIONS FOR EXAMPLE FIVE

Noun connection          Conjunction
Syntax level          Normal   Scrambled
X̄_kj                   22.38      18.06
S_kj²                 132.71      77.09
N_kj                      16         16
var(X̄_kj)               8.29       4.82
d_k                         4.32
var(d_k)                   13.11
W_k                        .0763

Since three sets of mean differences are
being compared, $U'_0$ is asymptotically chi-square
with two degrees of freedom if $H_0$ is
true. In order to compare the confidence
intervals with the Scheffé intervals, let α = .10.
For α = .10, chi-square equals 4.60, and as
a result, $H_0$ is rejected. The general form of
the set of simultaneous confidence intervals
about the contrasts is given by

$$(d_{k_1} - d_{k_2}) \pm \sqrt{\chi^2_{K-1}(1 - \alpha)}\,\sqrt{\mathrm{var}(d_{k_1}) + \mathrm{var}(d_{k_2})}.$$

The contrasts are listed:

 -15.99 < (μ_11 - μ_12) - (μ_21 - μ_22) < 5.01   Not Significant
 -21.97 < (μ_11 - μ_12) - (μ_31 - μ_32) < -.77   Significant
 -15.98 < (μ_21 - μ_22) - (μ_31 - μ_32) < 4.22   Not Significant

Since the Ns are equal and the variances are essentially the same, the F test would normally
be used. For this example, F = 2.77. The difference is significant with α = .10, since $F_{2,90}(.90)$
= 2.37. According to the Scheffé method of multiple contrasts, the interaction term involving
the conjunctions and verbs is statistically significant. The contrasts based on the Scheffé
method are listed:

 -16.01 < (μ_11 - μ_12) - (μ_21 - μ_22) < 5.03   Not Significant
 -21.89 < (μ_11 - μ_12) - (μ_31 - μ_32) < -.85   Significant
 -16.40 < (μ_21 - μ_22) - (μ_31 - μ_32) < 4.64   Not Significant


If the conditions for the analysis-of-variance
model are not satisfied, then it is recommended
that the proposed procedures be used. Whether
the adjustments of C. C. Li (1964) apply to the
small-sample counterpart of this example with
unequal variances is not known for certain by
the present author. However, one would think
that the adjustments would be applicable.

Note that, for this example, the large-sample
chi-square coefficient is equal to

$$\sqrt{\chi^2_2(.90)} = \sqrt{4.60} = 2.14$$

while the corresponding coefficient for the
Scheffé intervals is given by

$$\sqrt{2F_{2,90}(.90)} = \sqrt{2(2.37)} = 2.18.$$

Again the coefficients are nearly equal because
the F distribution multiplied by its numerator
degrees of freedom converges to the chi-square
distribution as the denominator degrees of
freedom increases.
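
The steps of Example Five can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the conjunction cell statistics, the three differences d_k, and the weights W_k are taken as quoted in the text and in Table 5, the helper name is hypothetical, and the chi-square quantile for two degrees of freedom is computed from its closed form -2 log α instead of a table.

```python
import math

# Conjunction cells from Table 5: (mean, S^2, n) for normal and scrambled syntax.
normal = (22.38, 132.71, 16)
scrambled = (18.06, 77.09, 16)


def difference_and_weight(cell_a, cell_b):
    """d = mean_a - mean_b, var(d) = S_a^2/n_a + S_b^2/n_b, W = 1/var(d)."""
    d = cell_a[0] - cell_b[0]
    var_d = cell_a[1] / cell_a[2] + cell_b[1] / cell_b[2]
    return d, var_d, 1.0 / var_d


d1, var_d1, w1 = difference_and_weight(normal, scrambled)

# Differences and weights for all three noun connections, as quoted in the text.
d = [4.32, 9.81, 15.69]
W = [0.0763, 0.0930, 0.0891]

d0 = sum(w * x for w, x in zip(W, d)) / sum(W)
u0 = sum(w * (x - d0) ** 2 for w, x in zip(W, d))

alpha = 0.10
chi2_crit = -2.0 * math.log(alpha)  # chi-square(2 df) at 1 - alpha
reject = u0 > chi2_crit
print(round(d1, 2), round(var_d1, 2), round(d0, 2), round(u0, 2), reject)
```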

Clearly, these examples do not exhaust all 
the possibilities that may arise in practice. 
The only requirements for the application of 
these techniques are large, independent, ran- 
dom samples with efficient estimates for the 
parameters. Large-sample theory guarantees 


that their joint density will be multivariate 
normal so that both the test statistic and the 
simultaneous confidence intervals may be em- 
ployed with any confidence coefficient that 
one might use. 

Derivation of the test statistic. Consider K
independent random variables $X_k$ with probability
distributions $f(x_k; \theta_k)$, k = 1, 2, 3,
..., K, from which independent random
samples of size $N_1, N_2, \ldots, N_K$ (all large)
have been selected. Let $\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_K$ be a
set of independent, asymptotically normally
distributed estimators for these parameters.
It is known from large-sample theory that
the quadratic form in the exponent of the
asymptotic K-variate normal density of
$\hat\theta_1, \hat\theta_2, \ldots, \hat\theta_K$ has an asymptotic
chi-square distribution with K degrees of
freedom. (See Mood & Graybill, 1963, p. 264)

Since the probability distributions are
statistically independent, $\mathrm{Cov}(\hat\theta_{k_1}, \hat\theta_{k_2}) = 0$ for
$k_1 \neq k_2$. As a result

$$U = \sum_{k=1}^{K}\frac{(\hat\theta_k - \theta_k)^2}{\mathrm{Var}(\hat\theta_k)}.$$

To test the hypothesis

$$H_0\colon \theta_1 = \theta_2 = \cdots = \theta_K = \theta_0$$

it is only necessary to evaluate U under $H_0$
and determine whether or not $U > \chi^2_K(1 - \alpha)$.
If U is too large, $H_0$ is rejected.

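For most applications, the exact value of
$\theta_0$ is unknown and must be estimated. An
easy-to-obtain estimate is the one that minimizes
U. This estimate is given by

$$\hat\theta_0 = \sum_{k=1}^{K}\frac{\hat\theta_k}{\mathrm{Var}(\hat\theta_k)} \Big/ \sum_{k=1}^{K}\frac{1}{\mathrm{Var}(\hat\theta_k)} = \sum_{k=1}^{K} W_k\hat\theta_k \Big/ \sum_{k=1}^{K} W_k .$$

If this estimate is substituted into U and if the
resulting expression is denoted as $U_0$, it follows
that

$$U_0 = \sum_{k=1}^{K}\frac{(\hat\theta_k - \hat\theta_0)^2}{\mathrm{Var}(\hat\theta_k)} = \sum_{k=1}^{K} W_k(\hat\theta_k - \hat\theta_0)^2$$

where $W_k = (\mathrm{Var}(\hat\theta_k))^{-1}$. Employing familiar
analysis-of-variance methods, it is easy to show
that $U_0$ is asymptotically chi-square with
(K - 1) degrees of freedom. To show this,
add and subtract $\hat\theta_0$ in U and expand the
resulting binomial. This gives

$$U = \sum_{k=1}^{K} W_k\left[(\hat\theta_k - \hat\theta_0) + (\hat\theta_0 - \theta_0)\right]^2 = \sum_{k=1}^{K} W_k(\hat\theta_k - \hat\theta_0)^2 + (\hat\theta_0 - \theta_0)^2\sum_{k=1}^{K} W_k = U_0 + U_1 .$$

The cross-product term drops out because
$\sum_k W_k(\hat\theta_k - \hat\theta_0) = 0$ by the definition of $\hat\theta_0$.
It is easy to show that

$$\mathrm{Var}(\hat\theta_0) = \frac{1}{\sum_{k=1}^{K} W_k}$$

and as a result

$$U_1 = (\hat\theta_0 - \theta_0)^2\sum_{k=1}^{K} W_k = \frac{(\hat\theta_0 - \theta_0)^2}{\mathrm{Var}(\hat\theta_0)}.$$

Since $U_1$ is the square of a variable with an
expectation of zero and a variance of one, it
has an asymptotic chi-square distribution with
one degree of freedom. In the limit, $U_0$ and $U_1$
are asymptotically independent, and because
of the additive property of the chi-square distribution,
$U_0$ is asymptotically chi-square with
(K - 1) degrees of freedom. Therefore, a
simple decision rule that may be used for testing
$H_0$ is to reject $H_0$ if $U_0 > \chi^2_{K-1}(1 - \alpha)$
and to not reject $H_0$ if $U_0 < \chi^2_{K-1}(1 - \alpha)$.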
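
The decomposition of U into a (K - 1)-degree-of-freedom part and a one-degree-of-freedom part is an algebraic identity, so it can be checked numerically for any inputs. The sketch below is only an illustration: it borrows the estimates and weights of Example One, and the hypothesized common value `THETA_NULL` is an arbitrary number chosen for the demonstration, not a quantity from the paper.

```python
# Estimates and weights borrowed from Example One.
theta_hat = [0.793, 0.867, 0.829, 1.589, 0.472]
W = [55, 65, 110, 34, 88]
THETA_NULL = 0.80  # HYPOTHETICAL common value theta_0, for illustration only


def u_total(theta0):
    """U evaluated at a given common value theta0."""
    return sum(w * (t - theta0) ** 2 for w, t in zip(W, theta_hat))


# Weighted mean: the value of theta0 that minimizes U.
theta0_hat = sum(w * t for w, t in zip(W, theta_hat)) / sum(W)

u0 = u_total(theta0_hat)                      # K - 1 degrees of freedom
u1 = (theta0_hat - THETA_NULL) ** 2 * sum(W)  # 1 degree of freedom

# The cross-product term: sum of W_k * (theta_hat_k - theta0_hat), which is zero.
cross = sum(w * (t - theta0_hat) for w, t in zip(W, theta_hat))

print(abs(u_total(THETA_NULL) - (u0 + u1)) < 1e-9, abs(cross) < 1e-9)
```

Moving the common value away from the weighted mean in either direction can only increase U, which is the sense in which the weighted mean is the minimizing estimate.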

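If the variances are unknown and the sample
sizes are large one can substitute the large-sample
estimates of the variances into the
final result with little loss. This also applies
to the estimate of $\theta_0$, which would then be
equal to

$$\hat\theta_0 = \sum_{k=1}^{K}\frac{\hat\theta_k}{\mathrm{var}(\hat\theta_k)} \Big/ \sum_{k=1}^{K}\frac{1}{\mathrm{var}(\hat\theta_k)}.$$

In addition, the test statistic would be

$$U'_0 = \sum_{k=1}^{K}\hat W_k(\hat\theta_k - \hat\theta_0)^2, \qquad \hat W_k = \frac{1}{\mathrm{var}(\hat\theta_k)}.$$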

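A particularly interesting set of contrasts is
the following $\binom{K}{2}$ contrasts

$$\psi_1 = (\theta_1 - \theta_2),\quad \psi_2 = (\theta_1 - \theta_3),\quad \ldots,\quad \psi_{\binom{K}{2}} = (\theta_{K-1} - \theta_K).$$

A (1 - α)% confidence set for these differences
is

$$(\hat\theta_{k_1} - \hat\theta_{k_2}) \pm \sqrt{\chi^2_{K-1}(1 - \alpha)}\,\sqrt{\mathrm{var}(\hat\theta_{k_1}) + \mathrm{var}(\hat\theta_{k_2})}.$$

Note that if K = 2, $\sqrt{\chi^2_1(1 - \alpha)} = Z(1 - \alpha/2)$
where Z is N(0, 1), one obtains the well-known

$$(\hat\theta_1 - \hat\theta_2) \pm Z(1 - \alpha/2)\sqrt{\mathrm{var}(\hat\theta_1) + \mathrm{var}(\hat\theta_2)}.$$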
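
The reduction to the familiar two-sample interval when K = 2 can be confirmed with the standard library alone. The sketch below is a hedged illustration with hypothetical function names: it uses the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable, so its quantile at 1 - α is the square of the normal quantile at 1 - α/2. It also counts the simple contrasts available for a given K.

```python
import math
from statistics import NormalDist


def n_simple_contrasts(k):
    """Number of simple differences among K parameters: K choose 2."""
    return math.comb(k, 2)


def chi2_1df_quantile(p):
    """Quantile of chi-square with 1 df, using chi-square(1) = Z^2."""
    z = NormalDist().inv_cdf((1 + p) / 2)
    return z * z


def two_group_interval(t1, v1, t2, v2, alpha=0.05):
    """Interval for theta_1 - theta_2 when K = 2 (estimates t, variances v)."""
    coef = math.sqrt(chi2_1df_quantile(1 - alpha))
    half = coef * math.sqrt(v1 + v2)
    return (t1 - t2) - half, (t1 - t2) + half


coefficient = math.sqrt(chi2_1df_quantile(0.95))
print(round(coefficient, 2), n_simple_contrasts(5), n_simple_contrasts(3))
```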
REACTION TIME AS A MEASURE OF
PERCEPTUAL VIGILANCE 1


LESLIE BUCK
Industrial Psychology Research Unit,2 University College London


This article reviews the use made of reaction time as an index of performance 
deterioration in monitoring tasks, with special reference to the hypothesis 
that reaction time and detection rate are correlated indices of perceptual 
vigilance. It is concluded that this is the case, and a theoretical model relating 
the 2 indices to changes in vigilance occurring with time on task is proposed. 


Perceptual vigilance or attentiveness is a 
hypothetical construct inferred from the ob- 
servation that monitoring performance dete- 
riorates under prolonged monotonous con- 
ditions, other factors remaining constant. 
Among the performance indices that may be 
used to detect this effect is the time elapsing 
between the presentation of a signal and the 
response indicating that the subject has de- 
tected it. This time interval commonly in- 
cludes the time taken to search the display and 
to detect that a critical event (the signal) 
has occurred as well as the time taken to 
evaluate the event and to choose and make 
the response, and as such it differs from re- 
action time as it has been defined and studied 
in classical reaction-time experiments. For 
this reason, these time intervals are gen- 
erally longer than those reported for classical 
time experiments, and some experimenters 
have preferred to designate them detection 
latencies. In this article, however, the term 
reaction time is used. 

Reaction time to a signal is determined by 
a number of factors of which vigilance level 
can be regarded as only one, and, in view of 
its hypothetical status, is demonstrable only 
when all others have been eliminated by holding
them constant. This means as a matter of
experimental procedure that the effect must
be demonstrated by repeating the signal so
that the first presentation acts as a control
for the second, and the two presentations
differ only in respect of temporal succession.


1 This paper was prepared under the supervision of
G. C. Drew for submission as part of a thesis for a
postgraduate degree in the University of London.
2 The Unit is now at the National Institute of Industrial
Psychology, London.


A change in reaction time can then be ascribed 
to the change in vigilance which, it is postu- 
lated, has occurred during the intersignal in- 
terval. Thus, although in theory it may be 
postulated that overall mean reaction time is 
a function of overall vigilance level, in practice 
it is the changes in vigilance level with time 
on task which are detected under the experi- 
mental procedure. It is the reaction time in- 
crement, or the rate of change of reaction 
time, which is therefore the significant de- 
pendent variable. Other procedural and meas- 
ural possibilities may arise when some vari- 
able other than time on task, such as a 
physiological variable, is shown to be a 
reliable objective index of perceptual vigi- 
lance level. 

One consequence of this relates to the 
interpretation to be placed upon the analysis 
of variance of mean or median reaction times 
for the successive periods of the experimental 
session, which is the test of statistical signifi- 
cance commonly applied to reaction-time data 
in a vigilance experiment. A significant period 
variance can be taken as evidence of deteriora- 
tion in performance, and a significant Period 
X Condition interaction variance as evidence 
that deterioration differed among conditions. 
A significant condition variance, on the other 
hand, cannot be taken as evidence that vigi- 
lance level differed among conditions, since 
any such difference may be attributed to the 
difference between conditions as such without 
invoking vigilance level as an intervening 
variable. A second consequence is the de- 
sirability of converting the original reaction- 
time data into their logarithmic equivalents 
so that increments become rates of growth, 
and comparisons can be made without reference
to initial or overall reaction-time levels.
This conversion is desirable anyway, in view 
of the typically skewed distributions of re- 
action-time data, if parametric statistical 
tests are to be applied. 
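
The point about logarithmic conversion can be made concrete with a small numerical sketch. The reaction times below are hypothetical values invented for illustration, not data from any study reviewed here: two subjects slow down by the same proportion (10% per period) from different starting levels. The raw increments differ between them, whereas the increments in log reaction time are identical, so comparisons no longer depend on the initial level.

```python
import math

# HYPOTHETICAL mean reaction times (seconds) over three successive periods.
# Both subjects slow by 10% per period, from different baselines.
fast_subject = [0.40, 0.44, 0.484]
slow_subject = [0.80, 0.88, 0.968]


def increments(times):
    """Raw period-to-period changes in reaction time."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def log_increments(times):
    """Period-to-period changes in log reaction time (rates of growth)."""
    return [math.log(later) - math.log(earlier)
            for earlier, later in zip(times, times[1:])]


raw_fast, raw_slow = increments(fast_subject), increments(slow_subject)
log_fast, log_slow = log_increments(fast_subject), log_increments(slow_subject)

print([round(x, 3) for x in raw_fast], [round(x, 3) for x in raw_slow])
print([round(x, 4) for x in log_fast], [round(x, 4) for x in log_slow])
```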

A survey of the literature showed that the 
performance index used by most investigators 
has been detection-rate decrement rather than 
reaction-time increment. This preference 
stems no doubt from the fact that the prac- 
tical problems which stimulated recent work 
in this area concerned the detection of in- 
frequent, transient signals, where the im- 
portant question is the probability of whether 
each signal will be detected—which may be 
inferred from detection rate—rather than 
marginal differences in reaction time. Further- 
more, when the characteristics of the real-life 
signals are simulated in the laboratory, a 
measurable detection-rate decrement is obtained,
so that this measure is sensitive as
well as appropriate. In other circumstances, 
however, as when ease of detection is a char- 
acteristic of the real-life situation or is theo- 
retically desirable in order that the subject 
should be aware of the temporal or spatial 
pattern of the signals, detection rate may be 
uniformly high and insensitive to changes in 
performance within the time limits imposed 
by the experimental facilities, and, in that 
case, reaction time must be used. 

The empirical observations that under ap- 
propriate conditions reaction time and detec- 
tion rate are both variable with time on task 
have led to the assumption that the two 
measures are indices of the same hypothetical 
construct. When variations of an independent 
factor have been shown to affect reaction- 
time increment in one experiment and detec- 
tion-rate decrement in another experiment, 
the results have been collated in terms of the 
relation between the common independent 
factor and changes in vigilance. Such treat- 
ment of the data presupposes that compara- 
ble variations in the independent factor have 
concomitant effects upon reaction-time in- 
crement and detection-rate decrement. Con- 
sideration of how far this is, in fact, the 
case provides one means of testing the validity 
of the basic assumption. A further means lies 
in the deduction that in those cases where 
both reaction time and detection rate are 
sensitive to performance deterioration, the
two indices should covary inversely with re- 
spect to time on task. A third method is to 
examine the evidence that reaction time and 
detection rate correlate in opposite directions 
with the variables which have been postulated 
as the physiological correlates of perceptual 
vigilance level. 

Apart from its theoretical interest, this 
question has important practical implications 
for applied psychology. When problems re- 
lated to industrial monitoring tasks are 
studied by means of simulated experiments, 
it may not be possible in practice both to 
simulate the display characteristics of the 
task and to obtain significant variations in 
the appropriate performance measure. In the 
task of driving a train, for example, the signal 
display is characterized by ease of detection, 
and the important performance measure is 
detection rate, which should be maintained 
at 100%. (It is not important that a train 
driver should respond as fast as possible to 
every signal, nor that his reaction time should 
remain uniformly low: in normal circumstances,
the driver may make his response in
his own time and in most cases no response
will be required anyway.) If the task is 
simulated in respect to ease of detection, it is 
difficult within the scope of normal experi- 
mental facilities to obtain statistically sig- 
nificant detection-rate data: in general, no 
signals will be missed. In such circumstances, 
it may be necessary to measure performance 
deterioration by the sensitive but strictly 
inappropriate index of reaction time, and the 
validity of applying conclusions based upon 
such experimental data to the real-life situa- 
tion depends on the implied relation between 
reaction time and detection rate being a real 
one. 


Factors Affecting Reaction-Time Increment 


Under this heading the relations between
reaction-time increment and various experimental
variables are discussed and compared
with data relating to detection-rate decrement.

Rest pauses and repeated sessions. Uninterrupted
prolongation of the test session is a
basic requirement for demonstrating a deterioration
of monitoring performance. Even a
short pause in the otherwise uninterrupted
course of the session is sufficient to prevent
the appearance of deterioration or to restore 
performance to a previous level. McCormack 
(1958) presented a light signal at irregular 
intervals and found that reaction time in- 
creased during a 40-minute session. After 
interruptions of 5 and 10 minutes in two 
separate groups of subjects, reaction time was 
lower and, in the latter case, was almost back 
to the original level. Jenkins (1958) used a 
more complex double task in which subjects 
had to detect exceptional deflections of an 
oscillating voltmeter pointer and to respond 
to a peripheral light signal. Reaction times to 
the light signal increased with time on task 
in an uninterrupted session of 90 minutes, but 
this was not the case when subjects were 
interrupted for 30 seconds every 5 minutes. 
The effect was the same whether the subject 
remained in or moved out of the test cubicle 
during the rest period. 

Jenkins’ experiment showed that the rest 
pauses had a comparable effect on pointer- 
signal detection rate. Other experimenters who 
have prevented the appearance of detection- 
rate decrement in this manner are Mackworth 
(1950) who interrupted subjects for 30 min- 
utes between two 30-minute periods, and 
Colquhoun (1959) and Bergum and Lehr 
(1962) who interrupted for 10 minutes be- 
tween 30-minute periods. Adams (1956) found 
that the detection-rate decrement in a 110- 
minute session was followed by a significant 
improvement after a 10-minute rest pause. 

The effect of an interruption is to initiate 
a period of recovery of vigilance, and Mc- 
Cormack’s data suggest that recovery be- 
comes more complete as the pause is extended. 
Experimenters who have used the same subjects 
in separate sessions of the task have in effect 
given very extended rest and have found that 
although performance in the second session 
has sometimes been better overall than in the 
first, presumably as a result of practice, the 
deterioration in performance has been much 
the same. Mackworth (1950) found no differ- 
ence in reaction-time increment in two 2-hour 
sessions of a synthetic radar test repeated at 
an interval of about a week. Webb and 
Wherry (1960) tested three subjects on an 
auditory monitoring task in which signals 
were 3-second variations in the pitch of a 
continuous tone and found that reaction-time 
increments in 9-hour sessions were the same 
on 5 successive days. McCormack (1960) 
reported that there was no significant replica- 
tion effect when he repeated his flashing-light 
test at an interval of 1 week. Adams, Humes, 
and Stenson (1962) required their subjects 
to monitor a screen showing six moving 
alphanumerical symbols for short-duration 
changes in code. The task was repeated in 
3-hour sessions on 9 successive days, and al- 
though there was a tendency for initial re- 
action time to decrease from day to day, 
neither the Replication × Period interaction 
variance nor the replication variance was sig- 
nificant. The reaction-time increment in a 
tenth session 7 days later was the same as that 
for the ninth session. 

Mackworth’s synthetic-radar experiment 
gave the same result in respect to detection- 
rate decrement. Pollack and Knaff (1958) 
gave a complete replication of a voltmeter de- 
tection task and reported that analysis of 
variance of detection rates yielded some sig- 
nificant factors, including the period factor, 
but the replication factor and the Replication 
× Period interaction were not among them. 
Thus the results for both indices indicate 
that performance deterioration is not offset by 
experience in the task, and that subjects can, 
in certain circumstances, be used as their own 
controls in assessing the effect of an experi- 
mental variable upon vigilance decrement. 

Some of the experimenters cited in this 
connection stated explicitly that the sessions 
were repeated at the same time of day. Jenkins 
studied the effect of repeating a morning 
session in the afternoon with a break of about 
2 hours between them. At the beginning of 
the afternoon session, light-signal reaction 
time and pointer-signal detection rate were 
improved compared to the end of the morning 
session, but subsequent deterioration of both 
indices was much more pronounced. Wilkinson 
(1961) reported a morning versus afternoon 
effect upon overall detection rate, but the 
differences in respect to decrement were not 
quite significant (and were in the opposite 
direction to those of Jenkins). Wilkinson, 
however, used separate groups of subjects. 

Signal detectability. Some reviewers (e.g., 
McGrath et al., 1959; Broadbent, 1964) 
specified low signal intensity, along with in- 
frequency and irregularity, as being char- 
acteristic of monitoring tasks in which per- 
formance deterioration is found. Formal evi- 
dence of a relation between detection-rate 
decrement and signal intensity is hardly re- 
quired in view of the general experience of 
experimenters in this field, although Mack- 
worth (1950) (bright versus dim radar sig- 
nals) and Adams (1956) may be cited. 
(Adams actually reported no significant dif- 
ference in decrement between two levels of 
signal intensity and two levels of signal dura- 
tion, but inspection of his data suggests that 
the two extreme combinations might be sig- 
nificantly different.) There is no experimental 
evidence dealing with the relation between 
reaction-time increment and signal intensity 
in a visual task. 

Experiments in nonvisual modalities have 
been carried out by Loeb and his co-workers 
using both reaction time and detection rate 
as indices of performance, and cross-reference 
comparisons elucidate their relation with sig- 
nal detectability. With white-noise signals set 
at intensities of 15 db. and 20 db. above the 
subject’s threshold and presented irregularly, 
they obtained different mean reaction times 
but no increment in either case. A longer rise 
and decay time produced a longer mean re- 
action time but again no increment. Detection 
rates were in all cases uniformly high and 
close to 100% (Hawkes & Loeb, 1961; Loeb 
& Hawkes, 1961). In another experiment, 
using tones set at 60 db. and 10 db., no re- 
action-time increment nor detection-rate de- 
crement was found for the former signal in- 
tensity, but deterioration on both indices was 
found for the quieter signals and overall 
performance was poorer (Loeb & Schmidt, 
1963). The tones were presented at a higher 
rate than the noise signals, but, on other 
evidence, this should have improved perform- 
ance and does not account for the effect. 
Hawkes and Loeb (1962), using a lower sig- 
nal rate than for any of their previous experi- 
ments, obtained no deterioration with 15 db. 
signals. There appears to be, therefore, a 
critical level between 10 db. and 15 db.: below 
it, auditory reaction-time increment and 
detection-rate decrement will be found, and 
above it they will not. 

In some instances these experiments in- 
cluded parallel conditions in which the audi- 
tory signals were replaced by cutaneous elec- 
trical signals of subjectively equal intensity, 
and these data provide similar results, al- 
though an inconsistency must be noted. No 
performance deterioration was found with 
signals set at 5.1 db. above threshold nor 
with 1.6 db. signals presented to male sub- 
jects. Female subjects produced a reaction- 
time increment but no detection-rate decre- 
ment with 1.6 db. signals. With 1.2 db. signals, 
there was a reaction-time increment and a 
detection-rate decrement from both male and 
female subjects. To this extent, therefore, 
performance deterioration depends upon sig- 
nal intensity, the critical level varying with 
the sex of the subject. Hawkes and Loeb 
(1962) reported, however, that they found a 
detection-rate decrement but no reaction-time 
increment using signals of 1.2 db. intensity. 

In monitoring tasks where signals must be 
discriminated from background stimuli, rela- 
tive rather than absolute intensity is presum- 
ably the operative factor. There is no direct 
evidence that this affects deterioration in 
performance, but three studies in which au- 
ditory signals had to be discriminated from 
background stimuli of different duration pro- 
vide indirect evidence. The duration ratios 
were 9:8, 3:4, and 2:1. Reaction-time incre- 
ment and detection-rate decrement were found 
in the first case, detection-rate decrement in 
the second (reaction times were not reported), 
and neither reaction-time increment nor detec- 
tion-rate decrement in the third (Loeb & 
Hawkes, 1962; Mackworth, 1950; Wilkinson, 
1964). It seems reasonable to conclude from 
this evidence that variations in signal detect- 
ability have concomitant effects upon the two 
performance indices, that is, as detectability 
increases, deterioration rate decreases. 

Signal rate and regularity. Experimenters 
generally insure that the periods into which 
they arbitrarily divide their test sessions con- 
tain equal numbers of signals presented at 
the same intervals on the assumption that 
these factors affect performance deterioration. 
Experimental validation of this assumption 
includes data from studies using the artificial- 
signal technique whereby performance in re- 
spect to real signals is improved by the ad- 
dition of simulated signals into the system. 
In the laboratory, real and artificial signals 
can be made indistinguishable, and the varia- 
tion is effectively one of signal rate. Garvey, 
Taylor, and Newlin (1959) used a display of 
eight voltmeters, and subjects watched for 
deflections of 60°. When real signals were 
presented at a rate, which varied between 
subjects, of between 1 and 4 signals in a 
2-hour session, reaction-time increased and 
detection-rate decreased with time on task; 
but when 96 artificial signals were added 
there was no performance deterioration in 
respect to the arbitrarily-defined real signals. 
This was equally the case when the artificial 
signals were distinguishable from the real 
(deflections in the opposite direction). Using 
a display of 3 voltmeters, Faulkner (1962) 
found that the addition of 27 artificial signals 
reduced the tendency for reaction times to 
9 real signals to become more variable with 
time on task. In neither condition did detec- 
tion rate decrease (signals remained on dis- 
play until detected), nor was there a sig- 
nificant reaction-time increment, which is 
hardly surprising in view of the brevity of the 
session (27 minutes). 

Baker (1960) found that detection rate 
for real signals was higher and decrement was 
smaller when indistinguishable artificial sig- 
nals were added to the display, but he gave 
knowledge of results (signals correctly de- 
tected, missed, and falsely reported) to the 
experimental group and not to the control 
group. Wilkinson (1964) showed that with- 
out the knowledge of results of artificial- 
signal detection, the effectiveness of the tech- 
nique in terms of real-signal detection rate is 
questionable. This is contrary to the results of 
Garvey et al. (1959) who found that explicit 
knowledge of results (signal not responded to 
within a given time) was not essential for a 
significant reduction in reaction-time in- 
crement. In their experiment, however, the 
signal seems to have been much more easily 
discriminated from background stimuli so 
that the subject may have been much more 
certain that he had in fact correctly detected 
a signal. Furthermore, the response cancelled 
the signal, and the return of the pointer to 
the null position may have provided a form of 
knowledge of results. 

Experimental data relating to all of the sig- 
nals in the display were provided by Adams, 
Stenson, and Humes (1961) who used the 
same task as Adams et al. (1962) under 
experimental conditions which differed pri- 
marily in respect to signal rate. In the first 
experiment, which employed 12 signals per 
hour, mean reaction time increased from about 
2 to 3 seconds over a 3-hour session. In the 
first criterion session of the second experiment, 
which employed 45 signals per hour, the in- 
crease was from about 1.25 to 1.75 seconds. 
These values have been taken from the pub- 
lished graphs, but they suggest that reaction- 
time increment was significantly smaller for 
the higher signal rate. 

Comparable results in respect to detection 
rate were obtained by Jenkins (1958) on his 
voltmeter—peripheral-light task. He found that 
pointer signal detection-rate decrement de- 
creased as signal rate increased in the range 
of 7.5-480 signals per hour. Kappauf and 
Powe (1959) demonstrated the same relation 
in the range 8-80 signals per hour and Nicely 
and Miller (1957) found it in respect to sig- 
nal rates of 12 and 72 signals per hour ap- 
plied to two parts of the same display. Deese 
(1955) reported a relation between signal rate 
and detection rate but did not give details of 
decrements. Other experimenters reported null 
results but these may be accounted for in 
terms of concomitant variations of other fac- 
tors. Thus, Ellis and Ahr (1960) found no 
decrements for signal rates ranging from 1.2 
to 24 errors per page in a proof-reading task, 
but the task was unpaced and less than 50 
minutes in length. Bergum and Lehr (1962) 
suggested that they failed to obtain an effect 
because the range of signal rates (6-24 signals 
per hour) was too narrow. York (1962) re- 
ported a null relation between signal-rate and 
detection-rate decrement but no details were 
given. 

Variations in signal rate are correlated with 
variations in mean intersignal interval and 
may be accompanied by variations in inter- 
signal-interval variability. Mean intersignal 
interval might conceivably be varied while 
signal rate is held constant by varying signal 
duration, but such an experiment does not 
appear to have been done. On the other hand, 
there have been experiments in which signal 
rate and mean intersignal interval have been 
held constant while intersignal variability 
(signal regularity) has been varied. Mc- 
Cormack and Prysiazniuk (1961) presented 
the flashing-light test at three levels of vari- 
ability—constant 60-second intervals, 30(15) 
90 seconds (that is, 30-, 45-, 60-, 75- and 90- 
second intervals randomly in equal numbers) 
and 10(25)110 seconds—and showed by anal- 
ysis of variance that although the period and 
variability factors were significant, the Period 
× Variability interaction was not. Dardano 
(1962) used an oscilloscope display on which 
sine waves of 5-second duration were re- 
peated at 10-second intervals. The signal, 
which was brighter, of greater amplitude, and 
remained until responded to, was presented at 
three levels of variability: 50(5)70 seconds, 
30(15)90 seconds and 10(25)110 seconds. 
Reaction-time increment increased as signal- 
irregularity increased, but the differences were 
nonsignificant due, it seems, to high subject 
variability. Only six subjects were tested in 
each condition. Boulter and Adams (1963) 
criticized the previous experimenters on the 
grounds that the subjects were given no op- 
portunity to learn the degree of variability 
before being tested, and, in the former case, 
the same subjects were used for both con- 
ditions. They presented the signal “20” on a 
digital instrument at three levels of variability 
—constant 220 seconds, three intervals in the 
range 120-270 seconds, and eight intervals in 
the range 15-900 seconds—and gave separate 
groups of subjects practice and criterion ses- 
sions on successive days, but they also failed 
to find a significant difference in reaction- 
time increment. In a subsequent experiment, 
Adams and Boulter (1964) increased the dis- 
play to three digital instruments and used two 
levels of variability—constant 195 seconds, 
and nine intervals in the range 15-438 seconds 
—and associated each with two levels of 
source uncertainty. They obtained a signifi- 
cant reaction-time increment in a criterion 
session for the variable-interval condition as- 
sociated with the predictable-source condition. 
The period variances for the corresponding 
constant-interval condition (and for the two 
other conditions) were nonsignificant. 
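The interval notation used in these studies, such as 30(15)90 seconds, denotes a set of intervals running from the first value to the last in steps of the bracketed value, each used equally often in random order. The following is a minimal sketch of how such a schedule could be generated; the function name, its parameters, and the number of repeats are hypothetical and are not taken from any of the experiments cited. It illustrates that the 30(15)90 and 10(25)110 schedules share a mean interval of 60 seconds and differ only in spread.

```python
import random
import statistics


def interval_schedule(start, step, stop, repeats, seed=None):
    """Build a shuffled list of intersignal intervals (in seconds).

    start(step)stop notation: every value from start to stop in
    increments of step, each appearing `repeats` times.
    All names here are illustrative assumptions.
    """
    levels = list(range(start, stop + 1, step))
    schedule = levels * repeats
    random.Random(seed).shuffle(schedule)  # random order, equal numbers
    return schedule


# Hypothetical example: the 30(15)90 schedule, four of each interval
schedule = interval_schedule(30, 15, 90, repeats=4, seed=0)
mean_interval = sum(schedule) / len(schedule)

# The wider 10(25)110 schedule keeps the same mean but a larger spread
wide = interval_schedule(10, 25, 110, repeats=4, seed=0)
spread_narrow = statistics.pstdev(schedule)
spread_wide = statistics.pstdev(wide)
```

Because the mean interval, and hence the signal rate, is identical across such schedules, any difference in reaction-time increment between them can be attributed to signal regularity alone.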

These experiments show that increased sig- 
nal irregularity produces increased reaction- 
time increment, but it appears that the varia- 
tion in the independent variable must be quite 
considerable before a significant effect can be 
detected. With respect to detection rate, Baker 
found a similar result in that a decrement was 
obtained in a 1-hour session using intersignal 
intervals in the range 45-600 seconds, but 
not for intervals in the range 60-360, nor for 
a constant interval of 150 seconds (cited by 
McGrath et al., 1959; see also Baker, 1959). 

Spatial characteristics. The spatial char- 
acteristics of the display may result in un- 
certainty as to where the next signal will 
appear if and when it appears, in the same 
way as the temporal characteristics lead to 
temporal uncertainty. Spatial uncertainty may 
be varied in a number of ways: by varying the 
number of signal sources, by varying the 
dispersion of the sources or the area of the 
perceptual field, and by varying the sequential 
probability with which appearance of a signal 
at one source or part of the field will be fol- 
lowed by appearance at another source or 
part. 

Adams et al. (1961) used their alpha- 
numerical-code task with two levels of sources, 
that is, with 6 or 36 symbols presented at one 
time. Although analysis of variance of the 
reaction times showed that the period and 
source-load factors were significant, the Period 
× Source Load interaction was nonsignificant. 
Jerison (1963) found a contrary result in 
respect to detection rate when he increased 
the number of signal sources from one to 
three: detection rate was lower for the three- 
clock condition but it did not decrease with 
time on task as it did with one-clock con- 
dition. However, signal rate was varied con- 
comitantly, and, furthermore, a choice response 
was required with the three clocks. On the 
other hand, Mackworth (1963) doubled the 
number of sources at which a signal might 
appear in a one-clock test by presenting sig- 
nals at all 20 sectors of the circumference 
instead of only every other one. Detection 
rate was lower in the 20-sector condition, but 
the decrement was the same as for the 10- 
sector condition. However, when she compared 
a whole-circumference condition with an 83%- 
circumference condition, analysis of variance 
of detection rates yielded a significant Period 
× Condition interaction. The evidence must 
therefore be judged inconclusive. 

In the display used by Adams et al. (1962), 
the symbols moved randomly with respect to 
each other so that they may have been dis- 
persed over a larger or smaller area at the time 
when the signal was presented. An index of 
separation was defined in terms of the ratio 
between the distance of the signal symbol 
from each of the other symbols and the mean 
intersymbol distance. It was found that 
whereas reaction times for signals associated 
with high and low ratios were not significantly 
different at the beginning of the session, by 
the end of the session those associated with 
isolated signals were significantly greater. 
This implies that reaction-time increment is 
greater when a given number of signal sources 
is more dispersed (Adams et al., 1962). 
Adams and Boulter (1962), on the other 
hand, found that when the angular span of a 
display of four digital-instruments was in- 
creased from 18° to 144°, the overall mean 
reaction time increased but the increment 
remained the same. In this experiment, dis- 
persion remained constant for the session, 
whereas in the previous experiment, it was 
continuously varying. 

Adams and Boulter (1962) set cue lights 
above their four instruments and flashed them 
in random order at 2-second intervals so that 
when a signal appeared, it appeared below the 
cue light last flashed. Thus spatial uncertainty 
was transposed to the cue lights with the effect 
that overall mean reaction time was reduced, 
but the increment remained the same. In a 
second condition, the cue lights were flashed 
in a regular order (for example, in repetitive 
sequences from left to right) so that sequential 
probabilities were maximized. This effected a 
further reduction in overall mean reaction 
time but no change in increment. The random 
versus regular sequences were applied to the 
instruments themselves (that is, in the regular 
sequence the subject knew at which instrument 
the signal would be presented), and each con- 
dition was associated with two levels of 
temporal regularity as described previously 
(Adams & Boulter, 1964). Analysis of vari- 
ance produced nonsignificant period variances 
for both conditions involving random se- 
quences and for one condition of regular 
sequence. A significant reaction-time incre- 
ment was found with the regular-sequence/ 
temporal-irregularity condition. These data 
show that overall reaction time can be re- 
duced by information about where the next 
signal will be presented, but they do not sup- 
port the contention that reaction-time in- 
crement can be reduced in this way. Baker 
(1958) found similarly that detection rate 
could be increased by indicating at which 
part of the display the next signal would (or 
supposedly would) appear when it appeared, 
but he did not give details of decrements. 

Choice responses. Most experimenters have 
required their subjects to make a simple re- 
sponse to a signal in the manner of Donders’ 
a-reaction or c-reaction. The evidence con- 
cerning the effect upon reaction-time incre- 
ment of requiring the subject to make a choice 
response to more than one category of signal 
is conflicting. When Adams et al. (1961) re- 
quired a four-choice response according to 
whether the signal appeared above or below 
the horizontal axis of the display and the 
numerical code was odd or even, overall mean 
reaction time was increased but there was no 
increment. When subjects made a simple re- 
sponse to the same signals, a significant incre- 
ment was found. Adams and Boulter (1962), 
on the other hand, varied response com- 
plexity by instructing their subjects to make 
a simple response to the detection of a sig- 
nal and to follow it with a six-choice response 
based upon the signal previously displayed at 
that source. The intention was that the sub- 
jects should memorize a changing display. 
Overall mean reaction time was longer as a 
result of this, which indicates that subjects 
made their choice before making the initial 
simple response. There was, however, a signifi- 
cant reaction-time increment in both this con- 
dition and the control condition (simple re- 
action), and the Period × Condition inter- 
action variance was nonsignificant. The only 
experimental evidence available on detection 
rate agrees with the findings of the latter 
rather than the former experiment. Whitten- 
burg, Ross, and Andrews (1956) found, using 
a clock test, that the decrement in double- 
jump signal-detection rate which occurred 
when the subject had to respond differentially 
to both single and double jumps was the same 
as when they responded to double jumps only. 

Concurrent tasks. The monitor may be re- 
quired to detect signals while he is concur- 
rently performing a second task, and it has 
been shown that the characteristics of this 
second task affect performance of signal de- 
tection. Jenkins (1958) found that reaction- 
time increment for peripheral light signals 
was affected by signal rate in a concurrent 
voltmeter-monitoring task. The increment as- 
sociated with the highest voltmeter signal 
rate of 480 signals per hour was generally 
lower than increments associated with 7.5, 
30, and 60 signals per hour. Aseyev (1960) 
measured simple reaction time to 50 light sig- 
nals presented at short irregular intervals 
and tested two groups of factory workers 
before work, during the lunch break, during 
the afternoon break, and after work. He found 
that among workers employed on monotonous 
conveyor-belt work, mean and standard devia- 
tion reaction time increased through the four 
test periods, whereas they remained stable 
among workers employed on free loading work. 
Haider (1963) related monitoring perform- 
ance to the nature of industrial work, but in 
this case the vigilance test went on while the 
subjects were doing their normal work. A 
lamp mounted on a headband was lit for .5 
second in an irregular schedule repeated 
every hour, and the subject responded by 
pressing a pedal. For subjects who did monot- 
onous work at a conveyer belt or packing 
machine on a continuous early morning shift, 
it was found that reaction time increased and 
detection rate decreased through the shift with 
a slight improvement in performance towards 
the end. For a second group of workers doing 
monotonous work on a day shift with a meal 
break, similar results were obtained. There 
was a slight improvement in detection rate 
after lunch, but reaction time increased 
through the day. A third group of subjects 
did spool-winding work which demanded a 
great deal of attention, and in their case 
reaction time and detection rate for the vigi- 
lance task did not correlate with hours of 
work. 

Both Aseyev and Haider account for their 
results in terms of the generally inhibiting 
effect of monotonous work, although the latter 
pointed out that in some instances interesting 
work may have the same effect because it is 
distracting. Apart from Haider’s data, the 
effect of monotony upon detection rate was 
shown by McGrath (1963) in that perform- 
ance on a visual-discrimination task was im- 
proved and decrement was reduced by giving 
the subject extraneous stimulation in the 
form of variable meaningful noise in place of 
white noise at the same intensity. The subject 
was not required to respond to the auditory 
stimulation, but it was implied that he lis- 
tened to it without overt response. The effect 
was not demonstrated, however, in a subse- 
quent experiment of more complex design, nor 
in an experiment with the sensory modes re- 
versed. In this latter case, the subject had 
greater control over the concurrent visual task 
and apparently chose not to attend to it in 
the latter part of the session. 

Subject motivation. The results of vigilance 
experiments are normally interpreted on the 
assumption that the subjects are endeavoring 
to comply with the instruction to detect every 
signal and to respond to it as rapidly as pos- 
sible, but this may need qualification. Loeb 
and Schmidt found a reaction-time increment 
with 60 db. auditory signals when their sub- 
jects were unpaid colleagues, but no incre- 
ments when competitively paid subjects were 
used. They also found that detection rate for 
10 db. signals was significantly lower for the 
former subjects (Loeb & Schmidt, 1960, 
1963). Sipowicz, Ware, and Baker (1962) 
found no detection-rate decrement in monitor- 
ing interruptions of a continuous light when 
monetary rewards based on results were paid, 
but there was a significant decrement other- 
wise. Pollack and Knaff (1958) found that 
monetary reward improved overall detection 
rate but reported no difference in decrement. 
Further improvement in detection rate, but 
again no difference in decrement, was ob- 
tained by giving punishment (a loud siren 
when a signal was missed) instead of reward. 
Bergum and Lehr (1964) gave monetary 
reward and found that its effect was short- 
lived and that its removal led to further 
performance deterioration beyond that of the 
control group. 

An alternative means of varying subject 
motivation has been to provide knowledge of 
results of performance, that is, whether a sig- 
nal has been correctly or falsely reported or 
missed, and whether it has been responded to 
more or less rapidly than the previous one. 
Loeb and Schmidt reported no reaction-time 
increment to 10 db. signals for their paid 
subjects when true knowledge of results was 
given, and a slight but significant increment 
with false but supposedly true knowledge of 
results. With simple acknowledgment that a 
response had been made, increment was no 
different than under control conditions. These 
data point to a true motivational effect of 
knowledge of results over and above the pos- 
sible facilitative effect of exposing the tem- 
poral sequence of the signals, as acknowledg- 
ment does, and informing the subject that his 
performance is deteriorating, which false 
knowledge of results fails to do. McCormack, 
Binding, and Chylinsky (1962) used the flash- 
ing-light test, and their data confirmed the 
facilitative effect of true knowledge of results 
and the null effect of acknowledgment. In a 
previous experiment, McCormack (1959) had 
found a small but significant increment with 
knowledge of results. McCormack, Binding, 
and McElheran (1963) showed that the effect 
could be obtained even when less than full 
knowledge of results was given (that is, 
knowledge of results was withheld from some 
responses), and McCormack and McElheran 
(1963) showed that the effect was nullified 
only when it was withheld from as many as 
80% of the responses. 

Similar results have been reported in respect 
to detection rate. Sipowicz et al. found that 


gained in overall detection rate if both 
conditions were combined. Using the same 
task as Sipowicz et al., Weidenfeller, Baker, 
and Ware (1963) showed that true knowledge 
of results and false-supposedly-true knowl- 
edge of results equally prevented the decre- 
ment that occurred both under control condi- 
tions and when the light signal which was 
used to indicate misses was flashed at ran- 
dom. Hardesty, Trumbo, and Bevan (1963) 
compared knowledge of results given by light 
signals with the effect of that given by the 
experimenter and found that while the latter 
produced a higher overall detection rate, both 
prevented the control-condition decrement 
equally well. The effect of motivating condi- 
tions was apparent in subsequent sessions 
when knowledge of results was withheld. This 
is one instance in which subjects cannot be 
used as their own controls, and this appears 
to be true of motivational factors in general. 

Summary. In general, there appear to be 
grounds for concluding that certain factors 
which lead to changes in reaction-time incre- 
ment may also lead to comparable changes in 
detection-rate decrement. In particular, this 
is true of variations in signal detectability, 
signal rate, signal regularity, and knowledge 
of results; and in the effects of rest pauses, 
repeated sessions, and concurrent tasks. The 
evidence relating to number of signal sources, 
choice versus simple responding, and mone- 
tary reward is less conclusive. In regard to 
the factors of spatial dispersion and sequential 
probabilities, there appears to be no evidence 
relating to detection-rate decrement. 


Covariance of Reaction Time and Detection 
Rate with Time on Task 


In the previous section, reaction-time data 
have been cited from 27 references. In 17 of 
these, the experimenters did not report detec- 
tion-rate data except to observe that it re- 
mained uniformly high, in the region of 90- 
100%. It is therefore clear that there are 
circumstances in which reaction time may be 
sensitive to deterioration in monitoring per- 
formance while detection rate is insensitive. 
In the remainder of these cases, both reaction 
times and detection rates were reported, and 
it is possible to consider whether the two in- 
dices covary in the manner predicted on the 
assumption that they are both indices of the 
same hypothetical construct. In general, this 
must be done by inspecting the tabular or 
graphical data of mean reaction times and 
detection rates for successive periods of the 
experimental session, since, with one excep-
tion, no computed correlation was reported.
By this means, evidence in support of the 
hypothesis is provided by Mackworth (1950) 
(benzedrine experiment, synthetic-radar test 
and main listening test), Garvey et al. (1959) 
(eight-voltmeter test), Loeb and Schmidt
(1960, 1963) (10 db. auditory signals with- 
out true knowledge of results), and Loeb and 
Hawkes (1961) (1.2 db. electrical signals). 
Rank order correlation coefficients computed 
from the published data range from .20 to 
.83 in the predicted direction. Jenkins (1958) 
reported graphical data which show a high
correlation between reaction time to periph- 
eral light signals and detection rate for 
voltmeter-pointer signals. Whittenburg et al. 
(1956) reported that detection rate and reac- 
tion time provided similar results in respect to 
a clock test, but details of the latter were not 
given. Haider (1963) computed the correla- 
tion between reaction time and detection rate 
in his industrial experiment and reported a 
value of .19 in the predicted direction. It was, 
however, nonsignificant. 
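
The rank-order correlations mentioned above can be illustrated with a short sketch. The period means below are hypothetical values chosen for illustration, not data from any of the cited studies, and the helper names `ranks` and `spearman_rho` are assumptions of this example. Because the hypothesis predicts that reaction time rises as detection rate falls, a coefficient "in the predicted direction" carries a negative sign when computed on the raw measures.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Rank-order correlation: the product-moment correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    covariance = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return covariance / (var_x * var_y) ** 0.5


# Hypothetical means for six successive periods of a monitoring session.
reaction_time_sec = [0.52, 0.58, 0.57, 0.63, 0.66, 0.64]
detection_rate_pct = [92, 85, 88, 80, 78, 74]

rho = spearman_rho(reaction_time_sec, detection_rate_pct)
print(round(rho, 2))  # negative: reaction time rises as detection rate falls
```

With only a handful of periods per session, coefficients of this kind are based on very few pairs of observations, which is one reason a value such as Haider's .19 may fail to reach significance.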

Data which are contrary to the hypothesis 
have also been reported. Hawkes and Loeb 
(1962) found that reaction time for 1.2 db. 
electrical signals presented against a blank 
background remained stable while detection 
rate decreased with time on task. This ap- 
pears to be inconsistent with the results of 
Loeb and Hawkes (1961), but no explana- 
tion was suggested by the experimenters. 
Loeb and Hawkes (1962) found a detec- 
tion-rate decrement for 5.1 db. electrical 
signals which was associated with an im- 
provement in reaction time. In this case, 
however, the task involved the detection of 
signals which were of longer duration than 
regularly presented background stimuli. The 
experimenters suggested that reaction time 
may not be a suitable index for measuring per-
formance to a change in duration. Binford and
Loeb (1963) found the customary detection- 
rate decrement in respect to a clock test but 
reported that reaction time remained stable, 
contrary to the results of Whittenburg et al. 

Apart from these three exceptions, the 
experimental evidence confirms the prediction 
that when detection rate decreases, reaction 
time increases; and that as fewer signals are 
detected, the monitor takes longer to respond 
to those which he does detect. At the same 
time, the converse is not true: reaction time 
may increase with time on task even though 
detection rate remains high. 


Physiological Correlates of Perceptual Vigilance


Prolongation has been specified as the basic 
requirement for demonstrating a change in 
vigilance, and time on task has therefore been 
taken as the independent variable with which 
changes in reaction time have been correlated. 
Time on task is in many respects an un- 
satisfactory measure to use as an experimental 
variable, and, for this and other reasons, at- 
tempts have been made to define physiological 
variables which may be used as direct objec- 
tive measures of vigilance. Such measures 
must be validated with reference to perform- 
ance measures which are the primary indices 
of vigilance, and studies concerned with this 
problem provide further evidence of the rela- 
tion between reaction time and detection rate. 

Cortical activity. Electroencephalographic 
(EEG) recordings show that the electrical 
activity of the brain is related to gross be- 
havioral changes, and that, in particular, the 
assumption of an attitude of alert attentive- 
ness is associated with the appearance of low 
amplitude, high frequency waves in place of 
the higher amplitude alpha waves of 8-12 
cycles per second which typify a state of 
relaxed wakefulness. Lindsley (1952) has 
proposed a theory whereby strong emotion, 
alert attentiveness, relaxed wakefulness, 
drowsiness, and sleep are sectors on a single 
continuum which is correlated with EEG 
wave patterns of decreasing frequency and in- 
creasing amplitude. Oswald (1962) has postu- 
lated cerebral vigilance as the neurological 
state underlying changes in EEG recordings 
and has suggested that cerebral vigilance is
the neurological correlate of perceptual vigi- 
lance. 

A correlation between reaction time and 
EEG amplitude in sleeping subjects, of the 
kind predicted by this theory, was found by 
Coleman, Gray, and Watanabe (1959). They 
presented loud noise signals as their subjects 
passed from relaxed wakefulness to deep sleep 
and recorded the time taken for them to be 
switched off. There were some instances of
very long reaction times during light sleep, 
contrary to prediction, but the experimenters 
suggested that the subjects may have incorpo-
rated these signals into their dreams. At a dif-
ferent point in the continuum, Lansing,
Schwartz, and Lindsley (1959) found shorter 
reaction times when a light signal occurred 
during the alpha-wave blockage evoked by a 
warning signal. The optimal foreperiod was 
equal to the time taken for the warning 
signal to effect alpha-wave blockage, and 
signals presented within that interval elicited 
longer reaction times. No significant difference 
of reaction time was obtained for signals 
presented without warning in periods when 
alpha waves were spontaneously absent. The 
experimenters suggested that this is because 
alpha waves disappear with transient chemical 
and electrical changes in the brain and not 
only when the subject becomes attentive. If 
this is so, it means that EEG activity is a less 
than perfect correlate of perceptual vigilance. 
An alternative suggestion which does not have 
this implication is that their subjects became 
attentive with alpha-wave blockage, but not to 
the signal that was about to be presented. 
Groll (reported by Haider, 1963) related 
EEG activity to reaction times to signals 
presented in the form of a monitoring task. 
She recorded the EEGs of subjects reclining 
in isolation in a dimly lit room while they 
monitored infrequent, irregular light signals, 
and found a negative correlation between re- 
action time to detected signals and EEG 
frequency in the 1-second interval preceding 
signal presentation. The correlations in respect 
to EEG frequencies in the preceding second 
and third seconds were smaller but in the 
predicted direction. 

Haider reported data relevant to detection 
rate in Groll’s experiment. The EEGs were 
inspected for all measurable waves below 12
cycles per second occurring in the 3-second 
period preceding signal presentation. It was 
found that the mean value of wave counts in 
periods preceding unobserved signals was sig-
nificantly lower than that for periods preced-
ing observed signals, and from this result a
positive correlation between detection rate
and EEG frequency may be inferred. There 
appears to be no other evidence concerning 
the relation between EEG activity and overt 
response to the detection of a signal. Zung 
and Wilson (1961) presented auditory signals 
(natural sounds) to sleeping subjects as they 
passed through the stages of sleep and noted
the frequency with which they elicited a re-
sponse, but for their purpose they defined a
response as the appearance of EEG activity
characteristic of a lighter stage of sleep. They
found that the probability of a response oc-
curring (signal-detection rate) was related to
EEG activity at the time of presentation.
They also found that their subjects could
awaken on the presentation of a prearranged
sound distinguished from casual sounds, and
that differential awakening (signal-detection
rate) was better when the EEG indicated a
lighter stage of sleep.

Other physiological indices. Attempts made 
to relate other physiological variables to mon-
itoring performance have included studies of
skin conductance, pulse rate, and electric
activity in the body musculature. Dardano
(1962) studied reaction time, and Ross,
Dardano, and Hackman (1959) studied detec-
tion rate, in relation to skin conductance. Al-
though these two studies produced some evi-
dence that high conductance is related to good
dence that high conductance is related to good 
performance, it is not conclusive. Haider 
(1963) reported a relation between pulse 
rate and detected versus undetected signals,
but there was no clear relation with reaction 
time. Kennedy and Travis (1947, 1948) were 
more successful in showing a relation be- 
tween monitoring performance and electric 
activity in muscles not used in the response 
movement. They recorded the action poten- 
tials in the supraorbital muscles and found 
that as the subject rested quietly these tended
to decrease in frequency. In the first experi-
ment, they presented a light and buzzer signal
as activity fell to a prechosen level and found
that reaction time was negatively correlated
with muscle-spike count and that complete
failures to respond were associated with very
low counts. A positive correlation between
detection rate and muscle-spike count may be 
inferred from these results. In the second 
experiment, the subject monitored a peripheral 
light signal in a 2-hour session during which 
he concurrently performed a tracking task. 
Signal presentation was determined by the 
amplitude of the supraorbital muscle action 
potentials (not their frequency, as in the first 
experiment) and the critical amplitude was 
varied as the session proceeded. Reaction 
times increased as the amplitude for presenta- 
covariance with vigilance decrement—under
not). It may be deduced, therefore, that
vigilance oscillates as it falls so that detection 
rate represents the proportion of time in a
unit period for which vigilance lies above the
critical level. These points have been incor-
porated into a theoretical model of the rela-
tion between vigilance and its two indices,
illustrated in Figure 1. In this model, the
assumption has been made that vigilance tends
to fall as the monitoring session proceeds, and,
for the purpose of the illustration, it has been
assumed that it falls in a straightforward 
manner. The curves for reaction time and 
detection rate have, then, been deduced from 
the relation between vigilance and the critical 
level. 
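
The deduction that detection rate is the proportion of a unit period during which an oscillating, falling vigilance level lies above a critical level can be sketched numerically. Everything quantitative here is an assumption made for illustration: the linear fall, the sinusoidal oscillation, the parameter values, and the function names `vigilance` and `detection_rates` are not taken from the model as published.

```python
import math


def vigilance(t_min, start=1.0, slope=-0.004, amplitude=0.15, cycle_min=7.0):
    """Hypothetical vigilance level: a linear fall plus a regular oscillation."""
    trend = start + slope * t_min
    return trend + amplitude * math.sin(2 * math.pi * t_min / cycle_min)


def detection_rates(critical_level, session_min=120, period_min=30, step_min=0.1):
    """Proportion of each period during which vigilance exceeds the critical level."""
    steps_per_period = int(round(period_min / step_min))
    rates = []
    for period_start in range(0, session_min, period_min):
        above = sum(
            1
            for k in range(steps_per_period)
            if vigilance(period_start + k * step_min) > critical_level
        )
        rates.append(above / steps_per_period)
    return rates


# A high critical level stands for brief or weak signals,
# a low critical level for intense, long-lasting signals.
high_critical = detection_rates(critical_level=0.9)
low_critical = detection_rates(critical_level=0.3)
print(high_critical)  # falls from the first period onward
print(low_critical)   # stays at 1.0 throughout this 2-hour session
```

Under these assumed values a high critical level yields a decrement from the start of the session, while a low critical level leaves detection rate uniformly high for as long as the falling trend stays clear of it.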
According to the model, the onset of detec-
tion-rate decrement is determined by the rela-
tion between initial vigilance level and the
critical level, which in turn is related to signal
intensity and duration. If the critical level is
sufficiently high, that is, signals are transient,
detection-rate decrement may be found from
the onset of the session even though initial
vigilance is high, and experimental evidence
confirms this. If the critical level is low, that
is, signal intensity and duration are high,
detection rate will be uniformly high, as ex-
perimental evidence confirms. The model im-
plies, however, that a point will eventually
be reached, if the session is long enough,
when detection rate will begin to fall even
though signal intensity and duration re-
main constant. This deduction is plausible if
drowsiness and sleep are regarded as extreme
manifestations of reduced perceptual vigi-
lance. Prolongation of the monitoring session
without interruption of the prevailing monot-
onous conditions may be conducive to sleep,
and cases of subjects falling asleep during
vigilance experiments have in fact been re-
ported. Oswald (1962) cited experiments in
which sleep was induced by means of intense
monotonous stimulation, so that high signal
intensity as such does not preclude this pos-
sibility. There are grounds, therefore, for
regarding this implication of the model as
acceptable, although strict experimental vali-
dation remains to be carried out.

[Figure 1. Model relating vigilance to signal-detection rate and reaction time over time on task; figure not reproduced.]


HEART-RATE CHANGE AS A COMPONENT OF THE
ORIENTING RESPONSE 1


system and from other autonomic systems and suggested, specifically,
heart-rate (HR) acceleration should be associated with


responses of HR deceleration and that


reflected a “defense,” “startle,” or “acoustic-cardiac” response.


Two developments have emerged recently 
in the field of psychophysiology which have 
important implications for the use of physio- 
logical measures as indices of psychological 
states. The Laceys (Lacey, 1959; Lacey,
Kagan, Lacey, & Moss, 1962; Lacey & Lacey, 
1958, 1964) have described specific heart-rate 
(HR) changes associated with complex situ- 
ations involving attention and internal prob- 
lem solving, while Sokolov (Roger, Voronin, 
& Sokolov, 1958; Sokolov, 1960, 1963a, 
1963b; Sokolov & Paramonova, 1961a, 1961b; 
Vinogradova & Sokolov, 1957; Voronin & 
Sokolov, 1960) has shown that autonomic 
changes are part of the orienting reflex (OR), 
a generalized response system which has 
major effects on learning and perceptual 
processes. A basic conception of both ap-
proaches is that autonomic feedback plays a
critical role in amplifying or reducing the 
effects of stimulation. The approaches differ 
in many respects, but they appear to conflict 
only in interpretation of the role of HR.
The purpose of the present paper is
to examine this apparent conflict.

Sokolov (1963b) has cited a growing body 
Obrist (1963). The hypothesis has also been
used to study attention in children (Kagan & 
Rosman, 1964) and in 6-month-old infants 
(Kagan & Lewis, 1965). 

Sokolov also implicated cardiac changes in 
the control of environmental inputs by includ- 
ing them as a component of the OR, a special 
functional system which serves to enhance 
sensitivity to external stimuli. He has cited 
extensive evidence from other Russian in- 
vestigators and from his own laboratory to 
demonstrate the role of the OR in lowering 
sensory thresholds. He has also described in 
detail many aspects of the complex combina- 
tion of somatic and autonomic reactions 
which form the system. Only casual mention 
is made of cardiac reactions, however, and he 
has not discussed the significance of the direc- 
tion of cardiac change. We are left to infer 
from one example (1960, p. 235) and one 
indirect reference (1963a, p. 546) that HR 
acceleration accompanies the OR. On the basis 
of Lacey’s work, deceleration would be ex- 
pected. 

In examining this inconsistency in the re- 
ported direction of HR changes, no attempt 
will be made to evaluate changes occurring in 
the kind of complex situations which the 
Laceys have employed. These have included, 
among others, the presentations of fluctuating 
white noise, a dramatic recording, a single 
letter which was to be used in constructing 
sentences, and a series of arithmetic prob- 
lems. While Obrist (1963) confirmed the re- 
sults, it is difficult with such complex situa- 
tions to ascribe HR differences unequivocally 
to any one dimension of situational differ- 
ences, and it is possible that other character- 
istics than the acceptance-rejection dimension 
could account for the HR findings. However, 
if the Laceys are correct in their inferences 
from neurophysiological evidence, their hy- 
pothesis should be able to predict changes in 
the kind of simple situations usually used to 
study the OR.


3 In inferring that similar cardiac responses should 
occur with the OR and during relatively prolonged 
attention to complex stimuli, it is not assumed that 
attention and orienting are identical processes. How- 
ever, HR changes are presumed from Lacey’s hy- 
pothesis to be especially relevant to the feature that 
both processes have in common and that both Soko- 

While Sokolov apparently assumed that
HR accelerates with the OR, the present 
authors were unable to find data supporting 
this assumption, except for the example cited 
above. Therefore, other studies measuring HR 
under similar conditions have been surveyed. 
For the sake of simplicity, the survey was 
restricted to studies using brief nonsignal 
stimuli, that is, stimuli incapable of reinforc-
ing other responses and not associated with
reinforcing stimuli either through condition- 
ing procedures or instructions. 

Sokolov described three major classes of 
response which may be elicited by such stim-
uli: orientation, defense, and adaptation. The
orientation class, or OR, is a system of un- 
conditioned motor, autonomic, and central 
responses elicited by any change in stimula- 
tion, independent of stimulus quality. Thus, 
both heat and cold evoke an OR on the first 
presentation although, upon repetition, each 
evokes a distinctive “adaptation” reaction. 
While derived from the earlier work of Pav- 
lov, Sokolov’s OR is more restricted than the 
relatively complex chain of conditioned and 
unconditioned exploratory-investigatory re- 
flexes described by Pavlov and is, in addition, 
explicitly related to the control of sensitivity 
to stimulation. 

To distinguish an OR, as Sokolov defines 
it, from “defense” reflexes, which are also 
independent of stimulus quality, certain cri- 
teria are available: (a) An OR is elicited by 
stimuli of low or moderate intensity while 
defense responses occur when the stimulus 
intensity is relatively high. (b) With the OR, 
there should be associated reciprocal responses 
of peripheral vasoconstriction and cephalic 
vasodilation. With the defense response, there 
are concomitant responses of constriction in
both head and periphery. (c) An OR has the 
same response pattern to both onset and off-
set of a stimulus since both are changes in
stimulation. This is not true of either defense
or adaptation responses. (d) Unlike adapta- 
tion and defense responses, which tend to be 
intensified by stimulus repetition, the OR 
diminishes rapidly (habituates) when a stim- 
ulus is repeated. 
lov and the Laceys have emphasized—the feature of 
enhancing sensitivity to environmental inputs. 

Unfortunately, the majority of HR studies
do not provide sufficient information to iden- 
tify an OR. It cannot be assumed that any 
given stimulus intensity is in the range which 
evokes orientation rather than defense unless 
it is known what intensities this range in- 
cludes. In the case of auditory stimulus in- 
tensity, Sokolov has given guidelines which 
will be discussed later. Satisfaction of any of 
the other criteria would provide evidence that 
the stimulus elicited an OR; however, only 
one study was found which measured cephalic 
vasomotor responses, few described the re- 
sponse to stimulus offset, and many averaged 
responses across a series of trials. A response 
averaged across trials would fail to show 
whether there was habituation or intensifica- 
tion with stimulus repetition and would be 
more likely to reveal whatever response re- 
placed the OR than the OR itself. 
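
The point about averaging across trials can be made concrete with a small sketch. The two series below are hypothetical HR changes (in beats per minute) invented for illustration: one diminishes with repetition, as an OR should, and the other intensifies, as adaptation or defense responses tend to do.

```python
from statistics import mean

# Hypothetical HR change (bpm) on six successive presentations of a stimulus.
habituating = [-10, -6, -4, -2, -1, -1]    # response diminishes with repetition
intensifying = [-1, -1, -2, -4, -6, -10]   # response grows with repetition

# The across-trial averages are identical ...
print(mean(habituating), mean(intensifying))

# ... although the first-trial responses, where an OR would be expected,
# differ by a factor of ten.
print(habituating[0], intensifying[0])
```

An averaged response therefore cannot be used to apply the habituation criterion; the trial-by-trial values are needed.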

The method of measuring response on a 
single trial also needs to be considered in 
evaluating the HR literature. Shortcut meth- 
ods, including averaging over relatively long 
periods of time or selecting small samples of 
activity, have frequently been used. Such pro- 
cedures cannot detect brief responses time- 
locked to a stimulus or inversions in the di-
rection of rate change. If, for example, HR
increased during the first 5 seconds following 
a stimulus and then decreased below baseline 
in the next 5 seconds, the average HR for 10 
seconds would reveal no change or would re- 
flect only whichever component, acceleration 
or deceleration, was larger. Similarly, if 
a sample of activity is taken, such as the 
difference between the 12 fastest beats in 
a 1-minute period preceding and following
stimulus onset, a diphasic response will not 
be detected. Further, the sampling method 
can confuse a change in variability with a
change in average HR. 
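
The 10-second example in the preceding paragraph can be worked through with hypothetical numbers. The baseline of 70 beats per minute and the change of 4 beats per minute in each direction are assumptions chosen so that the two phases are exactly equal in size.

```python
from statistics import mean

baseline_bpm = 70

# Hypothetical second-by-second HR after stimulus onset:
# 4 bpm above baseline for 5 seconds, then 4 bpm below baseline for 5 seconds.
post_stimulus_bpm = [74] * 5 + [66] * 5

change_10s = mean(post_stimulus_bpm) - baseline_bpm
change_first_5s = mean(post_stimulus_bpm[:5]) - baseline_bpm
change_last_5s = mean(post_stimulus_bpm[5:]) - baseline_bpm

print(change_10s)        # 0: the 10-second average shows no response
print(change_first_5s)   # 4: acceleration in the first 5 seconds
print(change_last_5s)    # -4: deceleration in the next 5 seconds
```

If the two phases were unequal, the 10-second average would carry only the sign of the larger one, which is the second outcome described above.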

A few studies were excluded from consid-
eration because they used different criteria to
characterize HR preceding and HR follow-
ing stimulus onset. For example, in one study,
prestimulus HR for the 2 immediately pre-
ceding seconds was compared with poststimu-
lus rate for the 2 seconds within 10 sec-
onds in which the fastest rates occurred. In
another study, the three fastest beats in a
5-second poststimulation period were com-
pared with the three fastest beats in a 10-
second prestimulation period. Even if means
and variances of the two periods were equal, 
such measures would yield apparent HR 
changes because different proportions of the 
distributions would be sampled. Although 
valid for comparing the effects of different 
treatments, they do not give a valid measure 
of the differences between pre- and poststim- 
ulation activity. 
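
The sampling argument can be checked with a brief simulation. This is a sketch under stated assumptions, not a reanalysis of either excluded study: beat-by-beat rates are drawn from one and the same normal distribution before and after the stimulus (mean 72, SD 4 beats per minute, both arbitrary), with roughly 6 beats in a 5-second period and 12 beats in a 10-second period. The helper `mean_of_fastest` is a hypothetical name.

```python
import random
from statistics import mean


def mean_of_fastest(rates_bpm, k=3):
    """Mean of the k fastest (highest-rate) beats in a period."""
    return mean(sorted(rates_bpm, reverse=True)[:k])


rng = random.Random(0)          # fixed seed so the sketch is repeatable
MEAN_BPM, SD_BPM = 72.0, 4.0    # identical before and after the stimulus
BEATS_5S, BEATS_10S = 6, 12     # approximate beat counts at about 72 bpm

apparent_changes = []
for _ in range(2000):
    pre = [rng.gauss(MEAN_BPM, SD_BPM) for _ in range(BEATS_10S)]
    post = [rng.gauss(MEAN_BPM, SD_BPM) for _ in range(BEATS_5S)]
    # Three fastest of 6 poststimulus beats vs. three fastest of 12 prestimulus beats.
    apparent_changes.append(mean_of_fastest(post) - mean_of_fastest(pre))

mean_apparent_change = mean(apparent_changes)
print(round(mean_apparent_change, 2))  # negative, although nothing has changed
```

The three fastest of 12 beats lie farther into the upper tail than the three fastest of 6, so the index shows an apparent deceleration even though the underlying distribution is unchanged. Because the bias is the same for every treatment group, comparisons between treatments remain valid, as the text notes.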

In reviewing the relevant studies, those em- 
ploying adult human subjects are considered 
first and in the greatest detail, since both the 
Laceys and Sokolov have derived their con- 
ceptions largely from work with such sub- 
jects. The discussion considers two possi- 
bilities: (a) that HR acceleration is a phasic 
and HR deceleration a tonic aspect of the 
OR, and (b) that HR acceleration is part of
the defense reflex and HR deceleration a com- 
ponent of the OR. Findings with infrahuman 
and newborn human subjects are reviewed 
more briefly. 

This review will not consider the question 
of whether HR changes are affected by the 
energy requirements of a situation. Although 
the question has received little systematic 
investigation, it is assumed that such effects 
do occur and that this may be a second di- 
mension interacting with the receptivity re- 
quirements of a situation to determine the 
final response. 


ADULT HUMAN STUDIES


Phasic and Tonic Aspects of the OR 


Sharpless and Jasper (1956) have extended 
the distinction between phasic and tonic 
skeletal reflexes to the EEG alpha-blocking 
responses, a component of the OR. The phasic 
aspect has short latency and brief duration; 
the tonic aspect has a longer latency and 
longer duration. In the intact organism, the 
two aspects are separable only during par- 
tial habituation when, according to Sharpless 
and Jasper, the phasic type of reaction alone 
is elicited. However, with brain lesions, they 
were able to show dissociation of the two 
aspects, the phasic response being controlled 
by the thalamic portion of the reticular for-
mation and the tonic by the brain-stem portion.

Sokolov also distinguished phasic and tonic
responses, the phasic being brief, discrete re- 
sponses such as the GSR, while tonic re- 
ferred to changes in baseline level. As Ber- 
lyne (1960, p. 94) observed, Sokolov’s lo- 
calized-generalized dichotomy appears to be 
closer to the Sharpless and Jasper use of the 
terms tonic and phasic. Sokolov himself 
(1963b, p. 264) connected localized reflexes 
with the more selective action of the thalamic 
reticular formation, and generalized reflexes 
with the diffuse action of the stem part. He 
also noted that the generalized reflexes showed 
more rapid habituation, that is, decrement 
with repeated presentations of a stimulus. 

Thus, whether the terms phasic-tonic or 
local-general are used, there is precedent for 
distinguishing brief, short-latency, and slowly- 
habituating aspects of the OR, presumably 
controlled by thalamic mechanisms, from 
longer-latency, longer-duration, and rapidly- 
habituating aspects under brain-stem control. 
It is possible, therefore, that inconsistency in 
the Sokolov and Lacey reports of HR change 
reflects concern with two different aspects of 
a diphasic response—HR acceleration being 
the initial phasic reaction, and deceleration a 
later, more prolonged, tonic response. This 
would imply different central mechanisms 
controlling the two aspects of the response, 
and is compatible with the existence of medul- 
lary and hypothalamic centers for control of 
cardiac activity. This possibility is also plausi- 
ble in view of the different methods employed 
by Sokolov and Lacey to measure HR re- 
sponse. Sokolov has been concerned with the 
relatively brief responses immediately follow- 
ing a stimulus and, typically, has presented 
individual continuous recordings in support 
of his conclusions. Lacey, on the other hand, 
has generally measured response over 1-min- 
ute periods and has selected an index of the 
response—the difference between the 12 fast- 
est beats during the minute preceding and 
following stimulus onset. 

It should be possible to empirically resolve 
the question of whether the OR has two car- 
diac phases, but when the several studies 
most closely approximating the relevant con- 
ditions were reviewed, results were found to 
be conflicting. Some studies reported a di-
phasic response, but some found only decel-
eration.

The best known of these studies is prob- 
ably the Davis, Buchwald, and Frankmann 
monograph (1955) describing a variety of 
autonomic and muscular responses to single 
stimuli. These authors first made a detailed
analysis of HR changes for 20 beats following 
and for 5 preceding the onset of a 98 db., 800 
cycle tone of 2-second duration. Averaged 
over 10 trials, this analysis revealed a diphasic 
response with an initial period of accelera- 
tion followed by a longer period of decelera- 
tion at about 5 seconds poststimulus onset. 
The decelerative phase fell below the pre- 
stimulus level. The authors next considered 
trial-by-trial changes but reported only a sin- 
gle index of the response, that is, the changes 
during the interval of maximum deceleration, 
5 to 7.5 seconds poststimulus onset. There 
was habituation of this decelerative com- 
ponent, response on the tenth trial being only 
about 20% as great as that on the first. No 
information was given concerning the accel- 
erative component. In a second experiment, a 
diphasic response pattern was again found for 
each of three 1000-cycle tones differing in 
intensity (70, 90, and 120 db.). The data 
were averaged for the four presentations of 
each stimulus, but marked habituation was 
presumably prevented by interspersing pres- 
entations of the different intensities. A third 
experiment investigated the nature of re- 
sponse to several tactual and thermal stimuli 
applied twice each. With these stimuli, HR 
responses appeared to be deceleratory only, 
although it is possible that a brief acceleration 
might have been missed by the method of 
grouping data into 2.5-second intervals. A 
later paper by Davis and Buchwald (1957) 
employed the same grouping interval and 
again reported a decelerative HR response, in
this case, to pictures.

Lang and Hnatiow (1962) also reported a 
diphasic cardiac response to a simple audi- 
tory stimulus. With onset of an 85 db., 800 
cps tone of 5-second duration, the heart be-
gan to accelerate, reaching a maximum ap-
proximately four pulses later. This was fol-
lowed by a long, frequently erratic period of
deceleration. Of particular interest is their
finding that the decelerative phase diminished
markedly with stimulus repetition while the
accelerative phase was relatively persistent. 

Two further studies found some evidence 
of a diphasic response although the evidence 
is questionable. Geer (1964) presented a curve 
for the first trial of a conditioning experiment. 
The curve showed a diphasic response to the 
onset of the 2-second visual CS, but the ini- 
tial acceleratory phase was small and lasted 
only 1 second. In addition, acceleration was 
intensified on later trials of a control group, 
receiving the CS alone, while the originally 
large decelerative phase was completely ha- 
bituated. Rudolph (1965) also obtained di- 
phasic curves in response to the first three 
presentations of a 10-second, 75 or 95 db. 
sound, but statistical tests showed that only 
the decelerative component, beginning at 4-6 
seconds after onset, was significantly different 
from the prestimulus baseline. 

Thus, three studies agree in suggesting a 
diphasic response to the initial presentations 
of an auditory stimulus, and one study found 
such a response to a visual stimulus. The 
acceleratory phase appeared to be relatively 
small, of questionable reliability, and rela- 
tively resistant to habituation. 

Other studies have found only a decelera-
tive response. As noted above, Davis et al.
(1955) and Davis and Buchwald (1957) ob- 
tained deceleration to visual, tactual, and 
thermal stimuli. Deceleration was also noted 
by Kanfer (1958) on the first four beats fol- 
lowing the first presentation of a 25 db. tone, 
and Wilson (1964) reported deceleration as 
the initial-trial response to a 3-second audi-
tory stimulus “of moderate intensity.” Sub-
jects in these latter two experiments were
instructed that they would receive a series of
tones and shocks. As will be discussed below,
such instructions may significantly alter the
experimental conditions, but presumably only
in a direction that would increase rather than
decrease the probability of eliciting an OR. 
Zeaman, Deane, and Wenger (1954) similarly 
found an initial decelerative phase in the re- 
sponse to a 60 db., 1-second tone when shock 
was expected. Since the response was averaged
over 20 trials, however, it may not be rele- 
vant to the present concern. Unpublished 
work from our laboratory (W. Chase and F.
Graham) has also found that an 18-second,
75 db. tone heard over 71 db. white noise
elicited only deceleration. A more pronounced 
deceleration followed the unexpected turning 
off of lights in the subject’s chamber. 

One additional study (Dykman, Reese, 
Galbrecht, & Thomasson, 1959) should be 
mentioned. While the response measure could 
not detect a diphasic response, the fact that 
HR was more rapid in the first 5 seconds fol- 
lowing than preceding a 60 db. tone is rele- 
vant to the question of whether there is any 
accelerative component of the OR. Twenty- 
nine of 40 subjects showed acceleration on 
the first trial; 25 on the second trial; less 
than half on the third trial. 

The available evidence thus supports the 
hypothesis that HR deceleration is at least a 
component of the response when human adult 
subjects are presented with nonpainful sim- 
ple stimuli. Some studies also obtained an 
initial phase of acceleration. What is the evi- 
dence that these responses are part of the 
OR? None of the studies measured the re- 
sponse to stimulus offset, and only one (Davis 
& Buchwald, 1957) measured cephalic vaso- 
motor changes. In this case, the vasomotor 
response was one of dilation, suggesting that 
the accompanying HR decrease could be con- 
sidered a component of the OR. The fact 
that decelerative responses habituated rapidly 
provides further evidence. All studies investi- 
gating the effects of stimulus repetition (Davis 
et al., 1955; Geer, 1964; Lang & Hnatiow, 
1962; Rudolph, 1965; Wilson, 1964) agreed 
on this point. The situation is less clear with 
regard to the accelerative component. Lang 
and Hnatiow reported that acceleration was 
relatively persistent, and Geer’s curves showed 
that acceleration intensified with repeated 
presentation of the stimulus. Only Dykman 
et al. (1959) found rapid habituation of an 
accelerative response. The possible significance 
of the accelerative phase will be discussed 
further in the following sections. 


Defense Reflex 

A second generalized, functional system of
unconditioned responses described by Sokolov,
the defense reaction, is also nonspecific
with respect to quality of the stimulus eliciting
it but does depend upon stimulus intensity.
It is evoked by strong stimuli and its
function is to “limit” stimulus action. The
cardiac rate changes accompanying defense 
were not explicitly discussed by Sokolov, but 
it would be expected from the Laceys’ analy- 
sis that HR should increase with a response 
system which serves to limit stimulus effects. 

It appears to be the case that HR increases 
in response to strong stimulation, at least in 
response to electric shocks of sufficient inten- 
sity to serve as the US in conditioning ex- 
periments. While reports on the direction of 
the conditioned HR response vary, an uncon- 
ditioned HR acceleration is virtually a uni- 
versal finding. An early study by Skaggs 
(1926) has sometimes been cited as an ex- 
ample of HR deceleration following shock. 
However, Skaggs’ tabled results show a rise 
in HR with warning of and immediately fol- 
lowing shock, although his description of the 
finding is ambiguous. 

The form and latency of the HR response 
to shock has not been systematically studied, 
but curves of the second-by-second response to 
a strong shock US were presented in three 
conditioning papers (Fuhrer, 1964; West- 
cott & Huttenlocher, 1961; Zeaman et al., 
1954). The curves showed pronounced accel- 
eration, rising to a peak in 3 to 4 seconds, and 
brief deceleration which occurred at approxi- 
mately the same point in time at which decel- 
eration appeared in the diphasic responses dis- 
cussed previously. However, HR remained 
well above prestimulus levels for at least an 
additional 4 seconds. The response was not 
followed further so that it is uncertain how 
long the increased rate persisted or whether 
there was a subsequent period of overcompensatory
deceleration. Deane and Zeaman
(1958) and Deane (1961) presented curves of 
response to a “mild” shock presented once. 
They found a similar but shorter acceleration, 
again with no second phase of deceleration 
below prestimulus level. The only report of 
acceleration which was followed by decelera- 
tion below baseline level has been made by 
Wilson (1964). He used a shock adjusted to 
the point “where it just became painful” and 
there was rapid habituation of this response 
with repetition of the stimulus. 

Less detailed evidence is available on the 
response to other intense stimuli. The study 
by Geer (1964) showed curves of response to
a 100 db. sound that was used as a US. These
were similar to the response curves reported 
with shock. DeLeon (1964), Kaebling, King, 
Achenbach, Branson, and Pasamanick (1960), 
and Stovkis, Liem, and Bolten (1962) also 
reported acceleration to loud sounds. How- 
ever, Shock and Schlatter (1942) found only 
deceleration to sounds characterized as star- 
tling which included such stimuli as a loud 
snap, an auto horn, and a cap pistol. The re- 
sponse was measured in 3-second units, and 
there were no differences between stimuli 
rated most startling and those rated least 
startling. The actual intensity levels of these 
stimuli were not reported, and even the most 
startling stimuli were rated only near the 
midpoint of a five-point scale. Intense ther- 
mal stimuli have also been used. Immersing 
a limb in ice water has elicited HR accelera- 
tion in several studies (e.g., Engel, 1960; 
Lacey & Lacey, 1962; Obrist, 1963), and 
four intensities of a heat stimulus were re- 
ported by Malmo and Shagass (1949) to 
elicit “a slight average decrease” following 
stimulus offset. Their most intense stimulus 
was “definitely painful” but they did not re- 
port results separately for this and milder 
stimuli. 

A number of early studies of HR were re- 
viewed by Darrow (1929) in an effort to eval- 
uate an hypothesis that resembles the Lac- 
eys’. The hypothesis similarly associated 
ideational activity with HR acceleration but 
differed in suggesting that sensory stimuli, 
without regard to unpleasantness or intensity, 
tended to decrease HR. Darrow noted that 
this latter view was not supported unequivo- 
cally. In particular, Brahn (1901) and Zoneff 
and Meumann (1902) reported that pleasant 
stimuli retarded and unpleasant stimuli accelerated
pulse rate. These studies will not be
reviewed further here. The experimental conditions
and response measures employed were
often vaguely reported, and stimuli frequently
involved contrived situations such as a
chair which fell backwards 60 degrees (Blatz,
1925), a hissing sound produced by burning 
a fuse in a glass of water (Kelchner, 1905), 
shouting at the subject to pay attention 
(Gent, 1903), and so on. 

It is difficult, therefore, to draw conclusions
about the form of the HR defense response
from these studies. However, if it is
assumed that the pattern found with shock
and with loud sounds is representative, it
appears that defense is characterized by relatively
prolonged HR acceleration.

This raises the question whether or not the 
accelerative component discussed in the pre- 
ceding section may be a defense reflex rather 
than an OR. Three of the studies reporting 
an accelerative component used auditory 
stimulation with intensities ranging from 70 
to 120 db. Although the lower values in this 
range might be labelled “moderate,” Sokolov 
speaks of 70 and 80 db. as “high” intensity 
stimuli giving rise to a defense reflex in one 
instance (1963b, p. 47), and, in another, as 
constituting a “pre-pain zone” where there is
a shift from OR to defense (pp. 179-180).
Further, Sokolov’s reference level was an in- 
dividually-determined threshold obtained with 
intermittent impulse stimulation using the 
method of limits. This procedure probably 
provided a lower reference level than the 1951 
American Standard for Audiometers 4 or the 
.0002 microbar reference levels employed by
the American studies cited. Judging by the 
differences between the 1951 American and 
the 1964 International Standard (Davis & 
Kranz, 1964), a correction of approximately 
10 db., varying with frequency, should be 
added to the intensities of the cited studies to 
make them comparable to Sokolov’s figures. 
Unfortunately, background sound levels were 
rarely reported, either by Sokolov or other
investigators.
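
The adjustment described above can be sketched in a few lines. This is only an illustration: it assumes a single flat correction of about 10 db., whereas the text notes that the true correction varies with frequency, and the function and constant names are hypothetical.

```python
# Approximate correction (db.) suggested in the text. The true value varies
# with frequency, so a single constant is an illustrative assumption.
REFERENCE_CORRECTION_DB = 10.0


def to_sokolov_scale(reported_db, correction_db=REFERENCE_CORRECTION_DB):
    """Shift an intensity reported against the American reference levels so
    that it is roughly comparable to Sokolov's threshold-referenced figures."""
    return reported_db + correction_db


# Range of intensities used by the studies reporting an accelerative component.
reported_range = (70.0, 120.0)
adjusted_range = tuple(to_sokolov_scale(db) for db in reported_range)
print(adjusted_range)  # (80.0, 130.0)
```

On this rough reckoning, even the weakest stimulus in the cited range (70 db.) corresponds to about 80 db. in Sokolov's terms, a level he described as "high" intensity.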

There is reason to suspect, therefore, that 
all three studies reporting a diphasic response 
to auditory stimulation (Davis et al., 1955; 
Lang & Hnatiow, 1962; Rudolph, 1965) used 
stimulus intensities strong enough to evoke a 
defense response either initially or within a 
few trials. In the border zone of prepain intensity,
it is possible that both defense and
OR can be elicited simultaneously. The decelerative
phase of the diphasic response habituated
rapidly which suggests that this
phase, at least, was an OR, but the accelerative
phase may have been a component of
a weak defense reflex. The acceleration was
small, resistant to habituation, and there was
no clear evidence that it appeared on the
first trials. An alternative explanation, which
will be discussed later, is that the accelerative
phase was a startle response.

4 Z24.3-1951: American Standards Association, 10
East 40th Street, New York 16, N. Y.

One study, by Dykman et al. (1959), us- 
ing a stimulus of 60 db., approximately the 
lower limit of Sokolov’s “pre-pain” zone, did 
report acceleration on the first trial and also 
found rapid habituation of the response. 
These authors commented that while their 
study was designed to investigate the OR, 
“the sudden and unexpected auditory stimu- 
lus was sufficiently loud to evoke a mild 
startle reflex in about one-third of the sub- 
jects. This appeared only on the first tone 
and was absent on the remaining stimuli.” 

Other studies using auditory stimuli did not 
find acceleration. None of these studies em- 
ployed intense stimulation. Kanfer (1958) 
used a 25 db. tone; Wilson (1964), a stimu- 
lus “of moderate intensity”; Chase and Gra- 
ham, from our laboratory, a tone of nearly 
the same intensity as the background white 
noise. Furthermore, instructions in the Kan- 
fer and Wilson studies probably would have 
served to make even a relatively strong stimu- 
lus capable of eliciting an OR rather than 
defense. Subjects were told that they would 
receive both shock and tones, and, while they 
were not told explicitly that shock and tone 
would be associated, it is possible that the 
instructions converted tone to a “signal” 
stimulus. According to Sokolov, signal stim- 
uli elicit an OR at higher intensities than 
nonsignal stimuli. 

Of the studies reviewed which used stimuli 
other than sounds, only Geer (1964) found 
an accelerative component preceding decel- 
eration. This study is particularly interesting 
because the accelerative component apparently 
increased with stimulus repetition while the 
decelerative component habituated. Geer de- 
scribed the response as diphasic on the first 
trial, but it is questionable whether the small 
increase in HR, lasting only 1 second, was a 
significant HR change. However, acceleration 
was clearly present on the fourth and fifth 
trials, by which time the decelerative phase 
had disappeared, and it was still more marked 
by Trials 19-20. This is what would be expected
if deceleration were the cardiac component
of the OR and acceleration a component
of the defense reflex. With repeated
stimulation, the OR habituates and is re- 
placed by a defense reflex that intensifies 
with further stimulation (Sokolov, 1963b, p. 
49 ff.). Therefore, if these changes with repe- 
tition were reliable, they argue against ac- 
celeration being a component of the OR. 
Some habituation, even of a phasic component 
of the OR, would be expected. 

The relationship between stimulus inten- 
sity, stimulus repetition, and the elicitation of 
OR and defense is complex when the direc- 
tion of a response component differs with the 
two reflex systems. If the direction of re- 
sponse is the same for a given component, as 
in the case of peripheral vasoconstriction, 
with increasing stimulus intensity, summation 
of the two reflexes strengthens the response 
and produces greater resistance to habituation
(Sokolov, 1963b, p. 180). However, if the
direction of response differs for a component 
of the two reflex systems, as in the case of the 
cephalic vasomotor response, increasing stim- 
ulus intensity first increases the degree of 
vasodilation and then, in the prepain zone, 
leads to weakened vasodilation and, finally, 
at still higher intensities, to vasoconstriction. 
Although the general rule is slower habitua- 
tion with increasing intensity, if defense and 
OR reactions differ, then replacement of the 
OR by a defense reaction occurs more rap- 
idly with higher intensities of stimulation. 
Thus, a response such as cephalic vasodilation 
should disappear with fewer repetitions as a 
stimulus that is within the range where de- 
fense replaces the OR becomes more intense. 
In contrast, a response such as peripheral 
vasoconstriction should be more stable with 
increasing stimulus intensity. 

A similar effect of stimulus intensity and 
repetition should hold for HR changes if the 
direction of response differs for orientation 
and defense reflexes. If HR deceleration is an 
OR which is replaced by a defense reflex of 
acceleration when stimulus intensity is suffi- 
ciently great, then within this replacement 
range HR deceleration should habituate more 
rapidly with higher stimulus intensities. If it 
is not replaced by acceleration, then it should 
habituate more slowly with higher stimulus 
intensities. Two studies are relevant to this
problem. Davis et al. (1955), comparing
habituation of the decelerative response to in- 
tensities of 70, 90, and 120 db., found no 
significant difference in the rates of habitua- 
tion. However, there were only four presenta- 
tions of each stimulus, interspersed among 
the other stimulus intensities, so that rela- 
tively little habituation of any stimulus 
would be expected. Rudolph (1965), giving 
15 repetitions of a stimulus, did find more 
rapid habituation of a decelerative response 
in subjects receiving a 95 db. sound than in 
those receiving a 75 db. sound. This suggests 
that there was a change in the direction of the 
response with the shift from OR to defense. 
A simultaneous increase in acceleration should 
also have been obtained, but, while curves 
suggested that such a component was present, 
it was not statistically significant on either 
initial or later trials with either stimulus.5

The above discussion adds to the evidence
that HR deceleration is at least one phase of 
the cardiac component of the OR and HR 
acceleration is the cardiac component of the 
defense reflex. There is also considerable evi- 
dence that deceleration is the sole cardiac 
component of the OR. Several studies were 
reviewed which made a detailed analysis of 
the response to initial presentations of weak 
or moderately intense stimuli, and these stud- 
ies obtained only deceleration. This suggests 
that an accelerative phase is not a necessary
component. Further, it appears probable that
those studies reporting a diphasic response
employed intensities in the range where defense
is elicited or rapidly replaces an OR.
The possible increase of acceleration with
stimulus repetition, seen in the Geer (1964)
study, supports this interpretation. Additional
support comes from infrahuman studies
discussed in the following section.

5 After preparation of this review, a systematic
study of auditory intensity effects was published by
Uno and Grings (1965). Their results were in general
agreement with previous data. Significant deceleration
occurred in response to 60 db. and a diphasic
response with significant acceleration to 70,
80, and 90 db. re .0002 microbars. However, the
response at 100 db. was less accelerated initially than
the response at lower intensities and included a significant
second phase of deceleration. This breakdown
of the otherwise consistent pattern is difficult
to interpret whether one assumes that the HR component
of an OR is diphasic, accelerative, or decelerative.
Since additional information from the authors
indicates that it is also not accounted for by
higher prestimulus levels at 100 db. or by imbalance
in preceding stimulus intensities or intertrial intervals,
the simplest explanation appears to be chance
variation. A second finding lends support to the
present hypothesis. There was a significant Intensity
× Repetition interaction which was partly due to a
shift from deceleration to acceleration in the response
at 100 db.

The question may be raised as to why ac- 
celeration and deceleration should occur on 
the same trial if they are components of dif- 
ferent reflexes. As Sokolov has illustrated in 
connection with the eye blink, when tenden- 
cies to elicit orientation and defense or adap- 
tation reflexes are simultaneously present, 
components of each can occur if the com- 
ponents are not incompatible. This would be 
the case if latencies were different. 

The possible mechanisms for control of HR 
change are sufficiently complex and varied so 
that short-latency changes under one form 
of control could occur before longer-latency 
changes controlled by a different mechanism. 
This is illustrated by an analysis of the physi- 
ological basis of the diphasic response com- 
monly found in conditioning studies. Obrist, 
Wood, and Perez-Reyes (1965) suggested that 
the short-latency, brief-acceleratory phase was 
due to a momentary loss of vagal tone asso- 
ciated with a respiratory gasp or larger in- 
spiration at stimulus onset. It could be re- 
duced by regularizing breathing (Wood & 
Obrist, 1964) and eliminated by pharmaco- 
logical vagal block. The longer-latency decel- 
erative phase was due to vagal restraint which 
masked simultaneous sympathetically-induced 
acceleration. By blocking pressor responses,
Obrist et al. were able to show that the deceleration
phase did not result from a homeostatic
reflex initiated by peripheral blood-pressure
changes.

It is impossible to similarly identify the 
basis for defense and OR changes in HR 
when a clear description of the form of the
response is still lacking. However, the above
analysis as well as other physiological work
(Bazett & Bard, 1956; Bond, 1943; Dykman
et al., 1959; Samaan, 1935a, 1935b) suggests
at least three separable aspects of the
HR response which may be relevant to the
present discussion: a short-latency acceleration
due to decreased vagal tone, a longer-latency
acceleration sympathetically controlled,
and a deceleration due to vagal discharge.
The sympathetic acceleration may be
associated with the defense reflex and vagally- 
induced deceleration with the OR. Accelera- 
tion due to loss of vagal tone may also be a 
component of the defense reflex but perhaps 
might better be considered a separate startle 
response for which the stimulus is “sudden- 
ness” of onset. An association between sud- 
den stimulus onset and startle has frequently 
been remarked (Dykman et al., 1959; Hoff- 
man & Searle, 1965; Landis & Hunt, 1939; 
Subbota, 1961). 

Fleshler (1965) investigated the question 
experimentally and found that behavioral 
startle in the rat was a function of acoustic 
rise time. The effective stimulus was the in- 
tensity reached within approximately 12 ms. 
of onset. Presumably, a stimulus rising slowly 
enough so that it fails to reach startle 
threshold within the first 12 ms. would not 
evoke startle at all even though it subse- 
quently rose to painful intensity. Conversely, 
even a mild stimulus might evoke startle if 
its peak intensity was above the startle 
threshold and it reached the peak in less than 
12 ms. Thus, it may be possible to have a 
slowly rising stimulus which does not evoke 
startle but evokes either defense or an OR, 
depending upon its peak intensity, and to 
have a rapidly rising stimulus which first 
evokes startle followed by either defense or 
orientation, again depending upon the final 
intensity reached. 
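
The rise-time argument above amounts to a simple two-step decision rule, sketched here. The sketch is illustrative only: the two thresholds, the arbitrary intensity scale, and the function name are assumptions introduced for the example and are not values reported by Fleshler or by any study cited here.

```python
def expected_reactions(intensity_at_12_ms, peak_intensity,
                       startle_threshold, defense_threshold):
    """Return the sequence of reactions suggested by the rise-time argument.

    All four arguments are intensities on the same (arbitrary) scale. The two
    thresholds are hypothetical parameters, not values given in the text.
    """
    reactions = []
    # Startle depends only on the intensity reached within about 12 ms. of onset.
    if intensity_at_12_ms >= startle_threshold:
        reactions.append("startle")
    # Defense or orientation depends on the final (peak) intensity reached.
    if peak_intensity >= defense_threshold:
        reactions.append("defense")
    else:
        reactions.append("OR")
    return reactions


# Hypothetical thresholds on an arbitrary intensity scale.
STARTLE, DEFENSE = 60, 90

slow_painful = expected_reactions(20, 110, STARTLE, DEFENSE)  # slow rise, painful peak
fast_mild = expected_reactions(70, 70, STARTLE, DEFENSE)      # peak reached within 12 ms.
fast_strong = expected_reactions(110, 110, STARTLE, DEFENSE)  # rapid rise, strong peak
print(slow_painful, fast_mild, fast_strong)
```

The three calls correspond to the three cases in the text: a slowly rising stimulus that reaches painful intensity without startle, a mild but rapidly rising stimulus that evokes startle followed by orientation, and a rapidly rising strong stimulus that evokes startle followed by defense.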

It should be noted that many methods of 
delivering sound stimuli produce large acous- 
tic transients at onset and these may some- 
times be sufficient to evoke startle. The prob- 
lem may be avoided by controlling rise time 
with an electronic switch. 

The studies discussed in this section em- 
ployed subjects in a waking state. Recently, 
Hord, Lubin, and Johnson (1965) presented 
30 db. tones during sleep. They found a di- 
phasic curve similar to that of Lang and 
Hnatiow (1962) but with a more pronounced 
accelerative phase during Stage 2 and rapid 
eye movements (REM) sleep. Because the 
OR does not habituate in sleep (Sokolov, 
1963a, 1963b), the curve, averaged over 120
to 291 stimulations, was assumed to represent
the HR component of the OR. It is also possi- 
ble that it represents a composite startle-OR 
response, and an investigation of accompany- 
ing respiratory changes and the effects of rise 
time would be desirable. 


INFRAHUMAN STUDIES 


While few studies of infrahuman subjects 
are relevant to the present concern, there is 
substantial support for the hypothesis of a 
decelerative OR and an accelerative defense 
response. In addition to descriptions of the 
HR response to initial presentation of a novel 
stimulus, some information was obtained 
concerning the response to stimulus offset and 
concerning the course of change in response 
direction with repeated stimulation. 

Four studies showed that in rats (Black, 
1964; Stern & Word, 1961), in cats (Flynn, 
1960), and in dogs (Petelina, 1958), the first 
presentation of an auditory stimulus was fol- 
lowed by HR decrease. There was little con- 
trary evidence. Two studies on rats reported 
no change in HR following, in one case, sound 
stimulation (Holdstock & Schwartzbaum, 
1965) and, in the other, light stimulation 
(Bloch-Rojas, Toro, & Pinto-Hamuy, 1964). 
Petelina (1958) also found no clear reactions 
following light stimulation although an insig- 
nificant acceleration for one or two beats was 
noted. In other work, two illustrative proto- 
cols of individual dogs showed that accelera- 
tion, as measured by the difference in mean 
HR during 5 seconds preceding and 5 sec- 
onds following stimulus onset, was the re- 
sponse to the first few stimulations with a 
30-second or 5-second sound (Fleck, 1953; 
Robinson & Gantt, 1947). 

The deceleration reported by Stern and 
Word (1961) habituated in a few trials. In 
two separate experiments, they found signifi- 
cant HR decrease in the first 4 seconds fol- 
lowing the first sounding of a “house bell.” 
There continued to be a small but nonsignifi- 
cant deceleration for the next five trials and 
no clear change thereafter. They also found 
that even with a brief electric shock there was 
a decelerative response in the majority of 
animals on the first trial, although the mean 
HR change was not significant. On subsequent
trials, deceleration was replaced by
a significant acceleration which persisted
through the 10 trials given. In a later condi- 
tioning study (Fehr & Stern, 1965), responses 
of a control group were considered only in 
10-trial blocks so that the response on the 
first trials could not be identified. However, it 
is of interest that the authors referred to the 
persistent acceleration obtained over 350 
trials as either “an orienting or a defensive 
response” which did not habituate “possibly 
because of the intensity of the stimulus (80 
db.).” 

Black (1964) also found that deceleration 
was replaced by acceleration after a few trials. 
On the first presentation of 40 db. white 
noise, 88% of 75 rats showed deceleration as 
the predominant response. For 10 rats given 
additional trials, acceleration replaced decel- 
eration on the fifth stimulation. The response 
measure was the greatest difference between 
HR in a 3-second prestimulus period and HR 
in any 3 seconds of a 20-second poststimulus 
period. This measure could not, of course, 
have detected a diphasic response nor change 
in a fixed period of time. However, it could 
indicate which response component (acceler- 
ation or deceleration) was greater on a given 
trial.
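
A measure of the kind Black described can be sketched as follows. The sketch rests on several assumptions not stated in the text: that HR is available second by second, that "any 3 seconds" means 3 consecutive seconds, and that each period is summarized by its mean. The function name and the sample values are hypothetical.

```python
def black_response_measure(pre_hr, post_hr, window=3):
    """Greatest difference between mean HR in the prestimulus period and
    mean HR in any `window` consecutive seconds of the poststimulus period.

    pre_hr:  second-by-second HR for the prestimulus period (3 values here)
    post_hr: second-by-second HR for the poststimulus period (20 values here)
    Returns a signed difference: positive = acceleration, negative = deceleration.
    """
    baseline = sum(pre_hr) / len(pre_hr)
    diffs = [
        sum(post_hr[i:i + window]) / window - baseline
        for i in range(len(post_hr) - window + 1)
    ]
    # Keep the difference with the largest absolute size, preserving its sign.
    return max(diffs, key=abs)


# Hypothetical trial: deceleration early, smaller acceleration later.
pre = [400, 400, 400]
post = [400, 380, 370, 375, 400, 410, 420, 415] + [400] * 12
print(black_response_measure(pre, post))  # -25.0
```

The hypothetical trial contains both a decelerative and a later accelerative phase, yet the measure returns a single signed value. This illustrates why such a score can show which component predominated on a trial but cannot reveal a diphasic response.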

Petelina’s findings with dogs were similar 
(1958). While all six subjects decelerated on 
the first presentation of a 60 db. tone, the 
response was considerably weakened on the 
next few trials, and on later presentations all 
dogs shifted to an accelerative response. 

Other work with dogs has been interpreted 
as contradicting Petelina’s findings, but such 
studies have not, in fact, described response 
to the initial stimulus presentations. Several
pioneering studies from the Pavlovian Laboratory
of Johns Hopkins University were
specifically concerned with what was called
the “questioning reflex” or OR, but the investigators
apparently did not conceive of this
as a rapidly habituating response and, with
the exception of individual illustrative protocols,
reported only the average responses for
series of trials. These averaged responses
were usually acceleratory in response to various
acoustic stimuli and deceleratory in response
to a blinking light (Robinson & Gantt,
1947). As noted above, two protocols were
found which showed that even on the initial
presentations of sound, average HR increased
during 5 seconds poststimulus onset. These 
same protocols showed a decelerative re- 
sponse at stimulus offset. The offset response 
habituated after a few trials, unlike the per- 
sistent postonset acceleration. Later work 
from this laboratory (Dykman & Gantt, 
1956; Reese & Dykman, 1960) found that 
250 to 440 trials were necessary to extinguish 
the accelerative response. 

A study by Soltysik, Jaworska, Kowalska, 
and Radom (1961) was an extensive and sys- 
tematic investigation of cardiac changes in 
dogs following sound stimulation. Unfortu- 
nately, responses were averaged in 10-trial 
blocks so that the first-trial responses were 
not described. However, this study did give in- 
formation on relative differences in the speed 
of habituation. Cardiac acceleration following 
onset of a 65 db. buzzer habituated very 
slowly over trial blocks, in contrast to rela- 
tively rapid habituation of a deceleratory re- 
sponse occurring at stimulus offset. It is in- 
teresting that acceleration late in the 10-sec- 
ond stimulus period habituated more quickly 
than the acceleration occurring in the first 3 
seconds. A similar phenomenon is noted be- 
low in the habituation of human neonatal 
responses.

Sokolov has emphasized that the ORs to 
onset and offset are identical and, in com- 
menting (1963a, p. 547) on the paper by 
Soltysik et al. (1961), said that the accelera- 
tion at onset and deceleration at offset could 
not both be components of the OR. He ap- 
peared to accept those authors’ conception of 
an “acoustic-cardiac reflex” in which HR 
level varies directly with the intensity of 
sound and suggested that this reflex sum- 
mated with the OR. Apparently on the as- 
sumption that the cardiac component of the 
OR should be acceleratory, he remarked that 
summation produced “an enhanced response
at the very beginning of sound stimulation.”
However, the data of Soltysik et al. (1961,
Fig. 7) showed more pronounced acceleration
towards the end of the 10-second stimulation
period than at the beginning. If summation
occurred, therefore, it would require an initially-decelerative
OR to produce the effects
obtained.

Whether or not there are species differences
in the rapidity with which the OR is
habituated or in the thresholds for elicitation 
of a startle-defense reflex is an interesting 
question. Apparently, there is no gross dif- 
ference in the rate of habituation of the de- 
celeratory response. Deceleration was replaced 
by acceleration within five to six trials in the 
two rat studies (Black, 1964; Stern & Word, 
1961) and was replaced or disappeared be- 
tween the fourth and ninth trials in three 
studies using adult human subjects (Geer, 
1964; Rudolph, 1965; Wilson, 1964). There 
does appear to be a more persistent and pro- 
longed accelerative response to moderate 
sound stimuli in both rats and dogs than has 
been found with adult human subjects. 
Thorpe (1963) and Razran (1961) gave examples
of stimuli eliciting startle responses
that are particularly resistant to extinction in 
certain species and suggested that these differ- 
ences are related to ecological conditions af- 
fecting survival. 


HUMAN NEWBORN STUDIES 


It appears relatively difficult to elicit either 
a decelerative or a diphasic response in new- 
borns, even on initial presentations of a novel 
stimulus. While most studies have used re- 
sponse measures which could not have de- 
tected a diphasic response or have averaged 
responses for a number of trials, work in our 
laboratory has shown that on the first pres- 
entation of a 75 db. sound, the response was 
a wave of acceleration which was not fol- 
lowed by deceleration below the prestimulus 
levels (Chase, 1965; Graham & Keen, 1965; 
Keen, Chase, & Graham, 1965). The typical 
response was an acceleration beginning within 
2 seconds of stimulus onset and lasting for 
varying lengths of time, depending upon stim- 
ulus duration. No response to offset could be 
detected. Davis, Crowell, and Chun (1965) 
also analyzed the HR response beat-by-beat 
following onset of several types of stimuli. 
Four stimuli elicited a significant response on 
the first presentation—a puff of air to the ab- 
domen; acetic acid held 5 mm. from the nose; 
an 80 db., 4-second warbled tone; and 50 
db. auditory clicks. In each case, the response 
was an acceleration of HR. 

The accelerative response shows some decrement
with repeated stimulation (Bartoshuk,
1962a, 1962b; Bridger, 1961) which may be
due either to increasing prestimulus levels or 
to reduction in the response to prolongation 
of stimulation (Chase, 1965; Keen, Chase, & 
Graham, 1965). Complete habituation was 
not found even after 5 days with 15 stimulus 
presentations per day (Graham & Keen, 
1965). 

Although the stimuli used in newborn 
studies might fall within the prepain range 
or might have a sufficiently sudden onset to 
elicit startle even on the first trial, the ab- 
sence of any decelerative phase following ac- 
celeration distinguishes the response from that 
of the human adult. Rudolph (1965) and 
Chase (1965), using the same stimulating 
conditions, reported a diphasic response with 
significant deceleration in adult subjects while 
acceleration alone occurred in newborns. 

The work of Lipton and Steinschneider 
(1964) indicates that while deceleration is 
difficult to elicit in the newborn, it can be 
elicited within the first few months of life. 
Infants tested at birth and retested with the 
same stimuli at 2, 4, and 5 months of age, 
shifted from a purely accelerative response at 
the early ages to a diphasic response at the 
later ages. In another study (Kagan & Lewis, 
1965), 24-week-old infants presented with 
various visual stimuli also showed decelera- 
tion. A supplementary report (Lewis, Kagan, 
Campbell, & Kalafat, 1965) indicated that, 
in this case, there was no initial acceleratory 
phase unless it was one which habituated 
rapidly and was thus missed by the method 
of averaging across trials. 

These observations suggest an early devel- 
opmental change in the nature of the cardiac 
response to simple stimuli. The newborn is
not unresponsive, but it is relatively difficult 
to elicit the decelerative response which, in 
mature human and infrahuman subjects, is 
apparently a component of the OR. It is 
relatively easy to elicit the prolonged accel- 
eration which is presumably a combined 
startle-defense reflex. 

How this pattern is related to the maturity 
of peripheral and central neural mechanisms 
is uncertain. The Scheibels (1964) have re- 
cently reviewed developmental neurophysio- 
logical research which documents the relative 
immaturity of neural structures at birth, and
Lipton, Steinschneider, and Richmond (1965)
have reviewed investigations of early auto- 
nomic functioning. While many autonomi- 
cally mediated reactions are difficult to elicit 
in the newborn, others are hyperactive and 
there appears to be little evidence for pre- 
dominant control by either the adrenergic or 
cholinergic systems. 


DISCUSSION AND SUMMARY 


This review was undertaken to reconcile the 
conflict between Sokolov’s assumption that 
cardiac acceleration is a component of the 
OR, a system serving to increase sensitivity 
to environmental inputs, and the Laceys’ hy- 
pothesis that cardiac acceleration is associated 
with decreased sensitivity. While empirical 
verification of the Lacey hypothesis has been 
obtained in complex situations differing in 
many respects from the simple situations in 
which the OR is usually studied, it appears 
that if the reasoning from neurophysiological 
evidence which underlies the hypothesis is 
correct, HR deceleration should be a com-
ponent of the OR.

When studies using simple nonsignal stim- 
uli were examined in the light of Sokolov’s 
criteria for identifying an OR, strong evi-
dence was found that HR deceleration is a
major component of orientation. In brief, the 
evidence showed, first, that on the initial pres-
entations of a stimulus to adult human or
infrahuman subjects, HR decreased when- 
ever the measure of HR change permitted 
identifying such a response. Human infants 
did not exhibit the response until a few 
months after birth. Second, it showed that
whenever the effects of stimulus repetition 
were measured, rapid habituation of the de- 
celerative component occurred. In addition, 
one study found that HR decrease to visual
stimuli was accompanied by cephalic vasodi-
lation, and there is evidence from another
study that the response to stimulus offset is
also a deceleration. 

The Laceys’ hypothesis would further pre-
dict that HR acceleration should be a com-
ponent of the defense reflex described by
Sokolov as “limiting” stimulus action. There
is support for this prediction. First, a rela-
tively prolonged HR acceleration followed
strong stimulation. Second, with some excep-
tions, when an accelerative response was ob-
tained, it was markedly resistant to habitua-
tion. This was particularly true in infrahuman
studies. Studies with human subjects gave 
fewer trials, permitting only the conclusion 
that an accelerative response is relatively 
difficult to habituate. Third, there is some 
evidence that acceleration is intensified by 
stimulus repetition. 

Judged by Sokolov’s distinctions between 
defense and orientation systems, the data 
from studies of simple stimuli thus present a 
generally consistent picture of HR accelera- 
tion with defense and deceleration with the 
OR. The picture is complicated, however, by 
some reports of a diphasic response in situ- 
ations presumably appropriate for eliciting 
an OR. The response was one of short la- 
tency, brief acceleration which was followed 
by more prolonged deceleration. Since the 
short-latency acceleration is presumably medi- 
ated by loss of vagal tone rather than by 
sympathetic activity, its appearance in an 
OR situation would not necessarily be preju- 
dicial to the Lacey hypothesis. The hypothesis 
is based on the inhibitory effect that stimula- 
tion of baroreceptors has on cortical activity, 
and an acceleration not accompanied by 
blood-pressure change would probably not in- 
volve baroreceptor discharge. Obrist et al. 
(1965), measuring intra-arterial blood pres- 
sure, found no change in either systolic or 
diastolic pressure during the initial phase of 
a diphasic conditioned HR response. 

There are, in any case, objections to view-
ing this initial acceleration as a “phasic”
component of the OR. Under many OR stim-
ulus conditions, it does not occur and so cannot
be a necessary part of the response.
Further, it appears not to show decrement 
with repeated stimulation and, at least in one 
instance (Geer, 1964), may have been in- 
tensified by repetition. 

Several alternative views of this accelera- 
tive phase are possible, although present evi- 
dence is insufficient to decide among them. 
The alternative of a partially inhibited de-
fense reflex, incompletely masked by the
dominant OR in early trials, was considered.
This is compatible with the fact that stimuli
eliciting diphasic responses have generally
been in the prepain zone of intensity while
low stimulus intensities have been followed by
deceleration alone, except in sleeping subjects. 
Another alternative is that initial acceleration 
is a startle reflex which depends, not upon 
peak stimulus intensity, but upon the in-
tensity reached within the first few milli-
seconds. Finally, the possibility of an adapta-
tion response specific to acoustic stimuli
should be considered (Soltysik et al., 1961). 
The clearest evidence of an initial accelera- 
tory phase was obtained from studies using 
auditory stimulation, and, except for neo- 
natal studies, there were only two reports of 
acceleration with nonauditory stimuli (Geer, 
1964; Petelina, 1958). In neither case was 
the acceleration shown to be a significant HR 
change. On the other hand, the critical di-
mension may not be auditory versus non-
auditory stimulation, but whether stimulus
intensity rose more rapidly with the auditory 
than with the nonauditory stimuli that have 
been employed. This factor, as well as in- 
tensity differences, should be controlled before 
differences are ascribed to sensory quality. 

This review of the HR literature offers en- 
couragement that orderly and psychologically 
meaningful relations exist between HR re- 
sponses and experimental manipulations. 
The order is not apparent without detailed 
consideration of second-by-second and trial- 
by-trial changes and might, even then, not 
emerge without the conceptual framework 
and objective criteria for identifying response 
systems which Sokolov has provided. Soko- 
lov’s assumption that cardiac acceleration 
accompanied the OR may be due to a failure 
to examine the question thoroughly. Without 
the insight offered by the Laceys’ hypothesis, 
no special interest would attach to the car- 
diac response, and inspection of HR change 
would not be very revealing with the record- 
ing methods that Sokolov employed. 

The findings not only support the Laceys’ 
hypothesis but, by implication, strengthen the 
position of both the Laceys and Sokolov that 
autonomic changes are important in the con-
trol of sensitivity to stimulation. The findings
further suggest that change in HR may be 
a particularly useful response in psychological 
investigations. It is probable that, as Kagan 
and Lewis (1965) proposed, HR deceleration 
may prove a valuable indicator of whether 


attention to 


It also appears that HR changes may prove 
more useful in differentiating defense and OR 
responses than the less reliably measured 
cephalic vasomotor response on which Sokolov 
has depended. 


CHOLINERGIC SYNAPTIC TRANSMISSION AND ITS
RELATIONSHIP TO BEHAVIOR * 


CAROL REEVES 


University of Michigan 


Neuropharmacological evidence reviewed strongly suggests that acetylcholine
(ACh) is a transmitter substance at many brain synapses. Behavioral evidence 
gives further support for this hypothesis. Rats with genetically high ACh levels 
learn consistently faster than rats from low-ACh strains, though only when 
massed trials are employed; the specific differences can be interpreted in terms 
of the theory of reverberation and consolidation of neural circuits. Other evi- 
dence suggests that a cholinergic system plays some role in the mediation of 
reinforcement and/or elimination of nonreinforced responses. 


Synaptic transmission is now widely ac- 
cepted as being chemically mediated. Acetyl- 
choline (ACh) is known to be the transmitter 
at neuromuscular junctions and postgangli-
onic parasympathetic nerve endings and in
sympathetic ganglia, while noradrenalin medi- 
ates transmission at postganglionic sympa- 
thetic endings. Much work has been done in 
an attempt to find what the transmitter(s)
in the central nervous system may be. Evi- 
dence for ACh has been accumulating in the 
neurochemical literature; in the past few 
years, papers discussing cholinergic mecha-
nisms have begun to appear in both clinical
and experimental psychological literature as
well. There are suggestions that the amount
and distribution of ACh in the brain may be
critical factors in some psychological processes,
and may be an important source of individual
differences. Before the experimental evidence
is reviewed, some comments will be made on
the sort of theoretical framework in which
chemical transmitters play a significant role.


1 This paper was written in partial fulfillment of
the requirements for the honors program in psy-
chology at the University of Michigan. The author
gratefully acknowledges the valuable suggestions and
moral support provided by her tutor, Dr. Stephen

One of the major and as yet unsolved prob-
lems in psychophysiology is to determine the
neural basis of memory. It is thought that 
some structural change must occur, but its 
nature has so far not been ascertained. One 
thing which seems to be fairly well-estab-
lished is that this change does not occur
immediately; rather, there is some sort of 
perseveration and consolidation process which 
goes on for a certain amount of time after 
a learning trial. This would seem to be the
only explanation for the findings that per- 
formance may improve over a period of time 
during which no overt practice occurs, and 
that if electroconvulsive shock (ECS) or cer- 
tain drugs are administered soon after a trial, 
learning is impaired, while if administered 
an hour or two later, there is no effect on 
learning. (Glickman, 1961, and Deutsch, 
1962, have reviewed work in this area.) 

D. O. Hebb’s (1949) theory of neural cir- 
cuits provides one way of accounting for 
perseveration. Input to the brain activates 
loops of neurons which may continue to fire 
one another even after the stimulus is re-
moved. If they fire often enough, some sort
of structural change occurs so that subse- 
quent firing of a few of the units will activate 
the entire circuit again. If ECS or drugs dis- 
rupt the original reverberating circuit, the 
structural change necessary for long-term 
memory will not take place. 

Kaplan (1962) has suggested how certain
neurochemical constructs may be incorporated 
in the Hebbian framework. A separate paper 
discussing the details of his approach is in 
preparation. A few points will be mentioned 
here: 

In the first place, it is postulated that the 
structural change which underlies memory is 
an increase in the amount of transmitter 
available at the cell endfeet which were active 
in the particular circuit involved. If any cell 
in the circuit is subsequently fired, more 
transmitter can be released from these end- 
feet and the next cell is more likely to be 
fired than it was before the original circuit 
was set up (although speaking of one cell 
firing the next is an oversimplification). This 
increased availability may be due to an in- 
crease in the amount of transmitter stored 
in that endfoot or an increase in the ease 
or speed with which the transmitter can be 
synthesized or released. Since it is possible 
that the brain cannot synthesize an indefi- 
nitely large amount of transmitter, building 
up the surplus of transmitter substance in 
certain endfeet of a cell might necessitate de- 
pleting the reserve in other endfeet. In molar 
terms, forming a new habit or memory may 
weaken an old one.²

Another important neurochemical consider- 
ation is that the amount of transmitter sub- 
stance available may influence the intensity 
and/or duration of activity in a neural circuit 
which has just been stimulated. This in turn 
could be expected to affect the degree of 
consolidation or the strength of the resulting 
memory trace (whether or not the long-term 
memory mechanism postulated above is cor- 
rect). It might also affect the readiness with 


2 In a review article published shortly after this
paper was written, Gaito and Zavala (1964) criti-
cize the synaptic approach and support instead the
position that intracellular changes in ribonucleic
acid (RNA) are the basis of memory. However, the
RNA theory has been subject to severe criticism
(e.g., Briggs & Kitto, 1962; Dingman & Sporn, 1964)
and is certainly not well-enough established to
warrant such ready abandonment of alternative
or supplementary theories. One of the arguments put
forth by Gaito and Zavala against the synaptic
theory is that there is a lack of confirmatory evi-
dence for cholinergic transmission in the brain. It is
hoped that this paper may serve as a partial answer
to that argument.
which the organism can turn its attention to
new stimuli or new thoughts; for instance,
with a small amount of transmitter (or
equivalently, a large amount of the enzyme
which breaks down the transmitter), a pres-
ently reverberating neural circuit could not 
remain active for very long and a new one 
would become dominant. Thus, biochemical 
individuality may underlie individual differ- 
ences in memory and in attention span. 
The validity of the above statements does 
not depend on which chemical is actually the 
transmitter in the brain. However, if the 
identity of the transmitter is known, manipu- 
lations of it may be made (e.g., with drugs), 
and the biochemical features of the theory 
may become testable. This review was under- 
taken to find how much evidence there is for 
cholinergic transmission and what behavioral 
changes occur when the levels of ACh or
its hydrolyzing enzyme acetylcholinesterase
(ChE) are manipulated. 


PHYSIOLOGICAL STUDIES OF BRAIN 
ACETYLCHOLINE 


In order to be considered as a possible 
transmitter, a substance must meet several 
criteria: (a) it must be present in and synthe- 
sized by brain tissue; (b) it must be released 
during neural activity; (c) it must have an 
excitatory action on the postsynaptic mem- 
brane; and (d) there must be an enzyme 
present to catalyze its breakdown, so that 
it does not keep the postsynaptic membrane 
depolarized indefinitely. Acetylcholine has 
been studied in the light of all four criteria.


Synthesis by Neural Tissue 


ACh is known to be present in the brain
and to be synthesized by nerve tissue in
vitro (McLennan, 1963). Much more ACh
is present in gray matter than in white
(Feldberg & Vogt, 1948), as would be ex-
pected if its action is synaptic. Its concentra-
tion is highest in the brain stem and caudate
nucleus, lowest in the cerebellum, and inter-
mediate in cerebral cortex, pons, and medulla
(Quastel, 1962).

ACh is synthesized from choline and acetyl
coenzyme A in a reaction catalyzed by the
enzyme choline acetylase (ChA) (McIlwain,
1959). The distribution of ChA in the nervous
system is roughly the same as that of ACh
(Quastel, 1962). When a nerve fiber is cut,
it is found that the ChA in the distal
segment disappears in a few days, whereas
there is a transitory increase at the proximal
end of the cut (Eccles, 1961). This suggests
that the ACh manufacturing system is pro-
duced in the soma and travels down the axon.
It could possibly be distributed to the various
endfeet according to their individual demands.

The native ACh of the brain exists in
bound form so that it is inert and free from
destruction. It can be released in vitro by
adding acid, increasing the potassium or
calcium ion concentration, or stimulating the
tissue electrically (McIlwain, 1959). The
binding is thought not to be chemical because
it is too easily broken (Stone, 1957).
Electron micrographs of nerve endfeet show
large numbers of small spherical vesicles,
and these have been suggested as possible
ACh containers by Whittaker (1959), who
found that fractions of brain tissue contain-
ing synaptic buttons filled with vesicles had
a much higher concentration of ACh than
other fractions. It has been further suggested
that the mitochondria, which are abundant
in the endfeet, load ACh into the vesicles or
break down into vesicles themselves, and elec-
tron micrographs show all stages in a trans-
formation of mitochondria into an assemblage
of vesicles surrounded by a membrane which
eventually disintegrates and liberates the
vesicles into the presynaptic terminal (Eccles,
1961).

Direct evidence has recently been found
for the presence of ACh in vesicles. De
Robertis and his colleagues (de Robertis,
Lores Arnaiz, Salganicoff, de Iraldi, &
Zieher, 1963) have developed a method for
separating the components of neural end-
feet from the brain into three subfractions,
one of them containing only vesicles.
The vesicle subfraction was found to be
the site of the endfoot’s ACh; ChA was
also present. This demonstration strongly
suggests a synaptic function for ACh in the
central nervous system.


Release during Neural Activity


To this reviewer’s knowledge, it has not
yet been possible to study the release of
chemicals from single neurons, so there is no
direct evidence that ACh (or any other sub-
stance) is released when a brain cell is fired.
There is, however, indirect evidence.

Richter and Crossland (1949) sacrificed 
rats in various physiological states by placing 


rats’ brains for ACh content and found that 
the rats sacrificed during sleep or pento- 
barbital anesthesia had significantly more 
bound ACh than normal rats (i.e., those in
a relaxed, waking state), while rats killed
during emotional excitement or convulsions
had significantly less than normal. In other
words, the level of bound ACh varied in- 
versely with the degree of activity of the 
brain, thus suggesting that ACh is released 
during such activity. 


The convulsions in the experiment
were induced by means of electrical stimula-
tion. Animals were sacrificed at various times
between the onset of the stimulation and the
termination of convulsions. It was found that
there was a rapid drop in bound ACh when 
the current was passed through the brain, 
followed by a rapid recovery when the cur- 
rent ceased. Convulsions began after this re- 
covery had taken place, and they ceased 
when the ACh level had once more fallen to 
about 40% of normal. They can be started 
again by application of ACh to the cerebral
cortex (Hyde, Beckett, & Gellhorn, 1949). 
That ACh-release also accompanies human 
convulsions is suggested by the fact that ACh 
is found in the cerebrospinal fluid of pa- 
tients undergoing convulsions (Tower & Mc- 
Eachern, 1949). The findings of Richter and 
Crossland may explain the increased fre- 
quency of epileptic seizures during early 
stages of anesthesia and in sleep, when the 
availability of ACh is high. 

Another method of studying the liberation 
of chemicals in the brain is to implant small 
plastic cups in the skull, fill them with 
Ringer’s solution, and collect substances re- 
leased from the cortex in these cups. Using 
this technique, Gaddum (1961) collected 
ACh from the cerebrum of sheep and cats; 
he found that the rate of liberation was in- 
creased by electrical stimulation of various 
nerves. In a more extensive study, Mitchell
(1963) found that ACh is continuously re-
leased from the surface of the cortex of anes- 
thesized sheep, cats, and rabbits, and that the 
rate of release is roughly proportional to 
the electrical activity of the brain. The deeper 
the anesthesia, the less ACh is released. Elec- 
trical stimulation of the cortex results in 
increased output of ACh in the ipsilateral 
primary somatosensory area. There is a con- 
comitant smaller increase in the contralateral 
somatosensory cortex; this increase (but not 
the former) is abolished when the corpus 
callosum is cut. 

Eccles (1961) has done a great deal of 
work on aspects of chemical transmission 
at neuromuscular junctions, with an emphasis 
on quantal release of ACh. (One quantum is 
the amount of ACh contained in one vesicle.) 
Since ACh-filled vesicles have been found at 
brain synapses, and since there is some evi- 
dence (McLennan, 1963) that prolonged 
stimulation results in a depletion of these 
vesicles, neuromuscular transmission mecha- 
nisms may offer some parallels to central 
mechanisms. 

Eccles found that muscle fibers exhibit 
miniature endplate potentials (EPPs) at 
random intervals. Except for their small size, 
these are like the EPPs which result from 
neuromuscular transmission, that is, they are 
prolonged by anticholinesterases and depressed 
by curare. The frequency of these miniature 
EPPs changes, but their size is constant; this 
fact led Eccles to suggest that they are due 
to individual quanta of ACh being released 
spontaneously from the presynaptic endfoot. 
Nerve impulses, then, generate EPPs by pro- 
ducing enormous increases in the rate of 
quantal emission of ACh. He postulated that 
the vesicles are sterically related to receptive 
sites on the inner surface of the presynaptic 
membrane, and that depolarization increases 
the number of these sites in some way. He 
further hypothesized that the vesicles are in 
a state of continual thermal agitation, so 
that those immediately next to the membrane 
may collide with the receptive sites, lock 
together with them, and eject ACh into the 
synaptic cleft. (No mention was made of 
what happens to the emptied vesicles.) The 
presynaptic impulse may have two effects: a 
brief (1 millisecond) jet of liberation of those 
vesicles next to the membrane, and a pro-
longed (200 millisecond) mobilization of the
rest of the vesicles so that they too can move 
to the front, so to speak. 

Since an increase in intracellular calcium 
is known to result in increased ACh output, 
Eccles suggested that the influx of calcium 
ions during depolarization may be responsible 
for the increase of receptive sites. If this is 
so, then posttetanic potentiation (and, simi- 
larly, postactivation potentiation in the cen- 
tral nervous system) might be explained by 
the fact that calcium is extruded slowly from 
the cell (Hodgkin & Keynes, 1957) and thus 
may exert its effect after the original stimulus 
has ceased. Such a calcium mechanism may 
be involved in short-term memory; that is, a
circuit in the brain which has just been 
reverberating may be reactivated easily be- 
cause the calcium has not yet been extruded 
from the cells involved (Kaplan, 1965). 


Effect on the Postsynaptic Membrane 


The effect of ACh on nerve cells may be 
studied indirectly by injecting ACh systemi- 
cally and observing the effects on behavior, 
EEG, and individual neuron activity. Recently, 
it has been studied directly by microinjection 
outside individual cells. Effects of cholino- 
mimetic and cholinolytic drugs may be studied 
in the same ways; they presumably affect 
endogenous ACh and ChE (the enzyme re- 
sponsible for the hydrolysis of ACh). The 
drugs most often used to block the action 
of ACh are atropine and scopolamine. Drugs
which inhibit ChE activity (and thus facili- 
tate ACh) are diisopropylfluorophosphate 
(DFP), prostigmine, and physostigmine 
(eserine). 

Dosage is an important variable in these 
injection studies. While a moderate dose of
ACh may increase the efficiency of neural
firing, a large dose may inhibit activity by
keeping the cell membranes depolarized so
they cannot fire again. It has been reported
(Crossland, 1960) that a dose of ACh which
is stimulant to the quiescent cortex becomes
inhibitory when the background activity of
the brain increases.

Administration of DFP to human subjects
results in symptoms which indicate increased
central neural activity: excessive
dreaming; insomnia; jitteriness and restless-
ness; and sometimes visual hallucinations
(Grob, Harvey, Langworthy, & Lilienthal, 
1947). Sherwood, Ridley, and McCulloch 
(1952) reported that normal animals injected 
with ACh exhibited agitation and apparent 
hallucinatory behavior. (“They seemed pre- 
occupied with an internal state, showing anger 
or fear in response to stimuli not perceived 
by observers.”) On the other hand, Feldberg 
and Sherwood (1954) found that injections 
of ACh or anticholinesterases into the lateral 
ventricles of cats produced catatonic-like 
stupors. In cats with catatonia induced by 
brain lesions, intravenous ACh resulted in a 
return or exaggeration of all catatonic signs 
while injected ChE produced transient remis- 
sion (Sherwood, Ridley, & McCulloch, 1952). 
The ACh in these cases had apparently 
reached the inhibitory level. At an intermedi- 
ate level, Rowntree, Nevin, and Wilson 
(1950) found that injections of DFP into 
normal human subjects resulted in slowness 
of thought without disturbance of orienta- 
tion, memory, or intellectual ability; this 
might be explained in terms of increased 
duration of perseveratory activity due to 
increased ACh function. 

The effects of DFP are diminished by 
atropine, suggesting that endogenous ACh is 
their mediator. Atropine by itself severely 
impairs recent memory, cuts the attention 
span to 15 seconds, renders the subjects un- 
able to do simple arithmetic problems, and 
produces a decrease in spontaneous speech 
and movement (Ostfeld, Machne, & Unna, 
1960). All these effects may be due to reduced 
brain activity resulting from reduced avail- 
ability of ACh. 

It has been found that injected ACh and 
anticholinesterases produce an activation syn- 
drome in the EEG—desynchronization in the 
neocortex; synchronization in the hippo- 
campus, thalamus, and midbrain reticular 
formation; and increased alerting responses 
to sensory and direct electrical stimulation. 
Anticholinergics have the opposite effect 
(Funderburk & Case, 1951; Monnier & Ro-
manowski, 1962; Wescoe, Green, McNamara,
& Krop, 1948). Hebb and Krnjević (1962)
suggested that this may be due to vascular
effects of the drugs or to stimulation of
sensory nerve endings which in turn activate
the EEG, rather than to direct cholinergic 
mediation of activation. However, Rinaldi 
and Himwich (1955) found that intracarotid 
ACh does not affect the EEG in a cerebral 
hemisphere which is isolated from the re- 
ticular activating system but which has 
its blood supply intact, so vascular effects 
would seem not to be critical. They also 
found that ACh and anticholinesterases are 
effective in activating the EEG in a cerveau 
isolé preparation in which most sensory input 
is cut off. 

It thus seems probable that ACh is a trans- 
mitter in some subcortical structure involved 
in mediating activation. The reticular formation 
itself is apparently not the site of cho- 
linergic action. McLennan (1963) reported 
that iontophoretic application of ACh to indi- 
vidual neurons in the reticular formation had 
no effect on the activity of these neurons.
Furthermore, the fact that ACh is effective in 
activating the EEG of a cerveau isolé (which 
is transected at the midbrain level) indicates 
that its action is at a more rostral site. 
Bradley and Key (1958) have suggested that 
ACh acts at the level of the diffuse thalamic 
projection system. This hypothesis is in line 
with Feldberg and Vogt’s (1948) report that 
the thalamus is relatively high in ACh. It 
might explain the finding of Bouzarth and 
Himwich (1952) that when injected DFP 
produced seizures in rabbits, the convulsive 
EEG activity appeared in the thalamus before 
spreading to the cortex. 

It has been noticed that the effects of 
cholinomimetic and cholinolytic drugs on the 
EEG are not always paralleled in behavior; 
that is, DFP can activate an animal’s EEG 
without producing behavioral alertness, and 
atropine can produce EEG sleep patterns 
without behavioral sleep (Bradley & Elkes, 
1957; Bradley & Key, 1958; Wikler, 1952). 
No one has explained this dissociation be- 
tween EEG and behavior. In human studies, 
the EEG effects are accompanied by corre- 
sponding behavioral changes (e.g., the studies 
by Ostfeld et al. and Grob et al., discussed 
above). 

Another way of studying the effects of ACh 
is to inject it systemically and record the 
changes in electrical activity of individual 
brain cells. Marrazzi (1953) studied a mono-
synaptic transcallosal path between the optic
cortices, stimulating one cortex and recording 
in the other. He found that intracarotid ACh 
in moderate doses enhances the postsynaptic 
potential, but a double dose depresses the 
response. These effects are relieved by atro-
pine. The unenhanced synaptic output can
also be blocked by atropine, suggesting that 
there is endogenous ACh involved. DFP pro- 
duces the same effects as ACh, and these 
effects are also blocked by atropine. In an- 
other phase of this study, Marrazzi stimu- 
lated the optic nerve and recorded in the 
optic cortex. He found that the effects of 
ACh, DFP, and atropine in this case are 
the same as in the other. 

David and his co-workers (David, Mara- 
yama, Machne, & Unna, 1963) studied re- 
sponses of individual lateral geniculate cells, 
as evoked by orthodromic stimulation of the 
contralateral optic nerve or antidromic stimu- 
lation of the optic radiation. The postsynaptic 
component was enhanced by low intracarotid 
doses of ACh or eserine, and depressed by 
atropine or higher doses of ACh. These 
researchers concluded that the effect must 
be at the synapse since the presynaptic 
spike and the antidromic response remained 
unaltered. 

Curtis and Davis (1963) have questioned 
whether ACh is actually the transmitter at 
lateral geniculate synapses. They injected ACh 
into the vicinity of individual cells through 
a micropipette and recorded the cell’s re- 
sponse with microelectrodes. They found that 
ACh had an excitatory effect on all the cells 
but that its excitatory action had different 
characteristics from that at pre-Renshaw cell 
synapses, where transmission is known to be 
cholinergic. In the former case, responses have
a greater latency and a longer duration. Fur-
thermore, dihydro-β-erythroidine suppresses
the excitatory action of ACh but not synaptic 
excitation due to stimulation of the pre- 
synaptic fiber; and 5-hydroxytryptamine has 
the opposite effect. It would seem that micro- 
injections of anticholinesterase or anticho- 
linergic drugs would give a better indication 
of whether or not there is endogenous ACh 
acting in synaptic transmission in the lateral 
geniculate area. Feldberg and Vogt (1948)
found a high concentration of ACh there.

A number of other microinjection studies 
have been done. Pickford (1947) found that 
when ACh is injected in the supraoptic 
nucleus of the hypothalamus, there is an
inhibition of the rate of urine flow in dogs,
presumably due to stimulation of the supra-
optic cells which in turn stimulate the pitui-
tary gland to secrete antidiuretic hormone.
(The urine inhibition is not seen after re-
moval of the pituitary.) The inhibitory action
of ACh on urine flow is prolonged by an 
injection of eserine; eserine injected by itself 
also results in urine flow inhibition. Since 
Feldberg and Vogt (1948) found that fibers 
converging on the supraoptic nucleus are 
rich in ACh, there is strong evidence for 
cholinergic transmission at these synapses. 

Other hypothalamic nuclei may also be 
cholinergic. Hernández-Peón (1962) reported
that an injection of ACh into the postero- 
medial hypothalamus results in deep sleep. 
Miller (1960) and Grossman (1962b) found 
that injections of ACh in the “feeding area” 
induce vigorous and prolonged drinking in 
food- and water-satiated rats. Atropine abol- 
ishes the drinking behavior. Injection of epi- 
nephrine and norepinephrine at the same 
locus in the same animals induces vigorous 
eating. ACh increases the water intake of 
normally thirsty rats but decreases the food 
intake of normally hungry rats; epinephrine 
has the opposite effect (Grossman, 19628). 
There thus appear to be two distinct chemical 
systems involved, in the same area if not at 
individual synapses. It will be interesting to
see if future work reveals such a dual mechanism
in other areas.

Curtis and Andersen (1962) studied single 
cells in the ventrobasal thalamus and found 
that almost all of them are stimulated by 
microinjections of ACh. However, because 
excitation has a greater latency and longer 
duration than that of Renshaw cells, they 
concluded that ACh is not the normal trans- 
mitter at these synapses unless the receptor 
sites are different from those of Renshaw 
cells. Again, injections of anticholinesterases
and anticholinergics might provide helpful
information.

Krnjević and Phillis (1961, 1963) have 
studied individual cells in the cerebral cortex
of cats, rabbits, and rhesus monkeys. They
found that a small proportion of these cells
(around 15%) are excited by ACh. The excitation
is characterized by a slow onset and
prolonged afterdischarge. The ACh-sensitive
cells tend to occur in clusters and often
show spontaneous activity related to the slow
waves of the EEG. They are mainly in the
somatosensory, visual, and auditory areas.
Spehlmann (1963), studying individual 
neurons in the visual cortex, found that spontaneous
and evoked firing of 20% of them
is increased by iontophoretically applied
ACh. Neural discharges evoked by illumination
of the eye are facilitated by application
of ACh during the light-activated phase in
on-neurons, the dark-activated phase in off-neurons,
and both phases in on-off neurons.
Prostigmine enhances the effects of subsequently
applied ACh. Without ACh injections,
the prostigmine also has activating and
facilitating effects on cells known to be sensi- 
tive to ACh, so endogenous ACh is very likely 
to be involved at these synapses. 

Presence of a Hydrolyzing Enzyme

The enzyme acetylcholinesterase (ChE),
which is known to catalyze the hydrolysis of
ACh, is present in large quantities in the
central nervous system. It is particularly
abundant in areas of the brain which normally
have intense neural activity (Tower,
1958). It is concentrated at axonal endings
(Geiger & Stone, 1962), probably along the
synaptic membrane (de Robertis et al.).
No ChE is present in subcortical white
matter (Geiger & Stone, 1962). Rosenzweig,
Krech, and Bennett (1958) reported that, in
rats, ChE activity is much the same in
the two hemispheres but varies significantly
from area to area within the hemisphere.
The motor cortex has higher ChE activity
than the somatosensory cortex, which in turn
has higher activity than the visual area. They
also found significant individual differences
in mean ChE activity. ChE activity increases
with age, up to about 100 days in rats, and
then declines; since brain weight increases
monotonically, there is no decline in total
ChE activity (Bennett, Rosenzweig, Krech,
Karlsson, Dye, & Ohlander, 1958).

Various studies have shown that ChE activity
changes in response to a known or presumed
change in ACh activity. Burkhalter
(1962) found that rats raised in darkness
have significantly lower ChE activity in the
retina than normal rats. He cited evidence
[...] the blindfolded eye than in the other.
Removal of one eye in frog larvae leads to a
deficiency of ChE activity in the contra- 
lateral optic lobe (Boell, Greenfield, & Shen, 
1955). A high salt diet results in higher than 
normal ChE activity in the hypothalamic 
nucleus concerned with water excretion; 
lactation is accompanied by high ChE activ- 
ity in the related hypothalamic nucleus 
(Pepler & Pearse, 1957). Unilateral lesions 
in the visual and somesthetic cortex of rats 
are followed by a significant increase in 
the ChE activity in the contralateral hemi- 
sphere, perhaps due to a rerouting of brain 
activity (Krech, Rosenzweig, & Bennett, 
1960b). 

There is some evidence that diffusion of 
ACh from the synapse may be an important 
factor in the cessation of propagation (ChE 
still being necessary to hydrolyze the diffused 
ACh). In certain retinal synapses, one nerve 
ending is embedded in the next nerve cell 
so that diffusion cannot occur rapidly; in
this case, one presynaptic impulse triggers a 
prolonged burst of firing (McLennan, 1963). 

Additional Considerations

There are other views of ACh function 
and of synaptic transmission which should 
be considered before any conclusions are 
reached: 

One school of thought, headed by Nach- 
mansohn (1962a, 1962b), holds that ACh is 
essential for axonal conduction rather than 
synaptic transmission. Nachmansohn’s hy- 
pothesis is that excitation of the cell mem- 
brane results in a dissociation of the binding 
of ACh. The free ACh then acts on a receptor 
protein which in turn changes the ionic 
conductance of the cell membrane, and bioelectric
currents are generated. The ACh is
then hydrolyzed by ChE, the receptor protein
and membrane conductance return to normal, 
and the currents cease. 

Several criticisms may be raised against 
this position, especially against Nachman- 
sohn’s apparent denial of chemical transmis- 
sion at the synapse: 

1. If ACh is universally essential for 
axonal conduction, its concentrations would 
be fairly uniform throughout the nervous sys- 
tem. This is not the case. There is almost no 
ACh and ChE in white matter, and concen- 
trations vary in different sections of gray 
matter. 

2. Microinjections of ACh outside the post- 
synaptic membrane of cells result in depolar- 
ization of the membrane, but injections inside 
the cell have no effect (Eccles, 1961). 

3. The existence of ACh-filled vesicles in 
the endfeet and of quantal miniature EPPs 
is hard to explain without positing cholinergic 
synaptic transmission. 

4. ACh is present in the perfusion fluid of 
recently active neurons. Nachmansohn says 
that this is due to the fact that ACh can leak 
out at the synapse but not along the axon; 
he found that if a fiber was cut before stimu- 
lation, ACh leaked out at the cut surface. He 
does not suggest why ACh should leak so 
regularly at synapses. 

Nachmansohn presented some evidence 
which is not easily explained by the synaptic 
transmission model alone—for example, that 
ACh and vesicles may be present on the 
postsynaptic side of the junction. However, 
this reviewer does not feel that Nachmansohn 
has in any way disproved chemical trans- 
mission at the synapse. His proposed mecha- 
nism may supplement the synaptic transmis- 
sion mechanism rather than replace it. The 
synaptic mechanism has been stressed in this 
paper, partly because the evidence for it 
appears stronger, and partly because the 
synapse is a place where integration may 
occur and where the changes underlying 
memory may take place. Axonal conduction, 
with its all-or-none property, would seem to 
be of less basic significance for psychological 
processes. 

It should be pointed out that ACh is likely 
to have functions other than synaptic transmission,
as it is found in some one-celled
organisms and in nonnervous structures (such 
as the spleen and placenta) in higher organ- 
isms (Florey, 1961; C. O. Hebb, 1957). 

It should also be emphasized that ACh is 
not considered to be the only transmitter in 
the central nervous system. Inhibition in the 
CNS (by hyperpolarization of the sub- 
synaptic membrane) is well-established; the 
existence of one or more inhibitory trans- 
mitters is assumed, though their identity has 
not as yet been ascertained. (For reviews on 
possible inhibitory transmitters, see Florey, 
1961; McLennan, 1963; Roberts, 1960.) 

It is likely that other excitatory transmit- 
ters also exist. An adrenergic mechanism is 
often suggested since noradrenalin is a trans- 
mitter in the sympathetic nervous system. 
Adrenalin and noradrenalin exist in small 
quantities in the brain, mostly in the hypo- 
thalamus, a structure concerned with regula- 
tion of sympathetic activity (Vogt, 1954). 
The studies by Grossman and Miller discussed
above suggest an adrenergic mechanism
in the feeding center of the hypothalamus.
Some adrenalin and noradrenalin
are also present in the midbrain reticular
formation, where they may play a role in
mediating arousal. Tests of this hypothesis
have led to many conflicting results (see the
review by Longo, 1962), and adrenergic transmission
in the midbrain is far from being
well-established.

Serotonin (5-hydroxytryptamine) is also
often nominated as a transmitter. It is present
mainly in the hypothalamus, amygdala,
and limbic cortex (McIlwain, 1959), areas
concerned with emotional behavior. It has
been of particular interest because the hallucinogen
LSD-25 is known to antagonize its
action. However, some hallucinogens are not
antagonists of serotonin (McLennan, 1963),
and some antagonists are not hallucinogenic.
Conflicting results were also reported for the
action of serotonin on the EEG (Longo,
1962). For a review on serotonin, the reader
is referred to Page (1958).

Since ACh apparently excites only 15-20%
of cortical neurons, and since it is not likely
that the other 80-85% are all inhibitory,
there must be some other transmitter(s) in
the cortex. If so, their identity remains
unknown. Serotonin, adrenalin, and noradrenalin
are not present in the cortex in appreciable 
amounts. 

ACh thus remains the most widespread and 
abundant of all the suggested transmitters, 
both in the cortex and in subcortical struc- 
tures; even in the hypothalamus, its concen- 
tration is greater than that of serotonin, 
adrenalin, and noradrenalin (Crossland, 
1960). ACh also meets the criteria put forth 
earlier better than any other contender. It 
therefore seems fair to conclude that ACh 
is an important transmitter substance in the 
brain—especially in the cortex, diffuse 
thalamic projection system, hypothalamus, 
and lateral geniculate. The remainder of this 
paper is devoted to a discussion of what rele- 
vance this conclusion (especially as consid- 
ered in its theoretical framework) may have 
for psychologists.²


CHOLINERGIC MANIPULATIONS IN 
BEHAVIORAL RESEARCH 


In the earliest experiment involving the 
effects of cholinergic manipulations on be- 
havior, Essig and his colleagues (Essig, 
Hampson, McCauley, & Himwich, 1950) 
found that intracarotid injections of DFP 
in cats, dogs, rabbits, and monkeys induce 
contraversive circling (i.e., turning away from 
the injected side), unless the dose is too high, 
in which case convulsions occur. In all ani- 
mals that circle, ChE activity is found to be 
lower in the cortex and caudate nucleus on 
the injected side than on the other. ACh 
would thus be dominant in that hemisphere 
and could be expected to stimulate muscle 
contractions in the contralateral side of the 
body, inducing circling. In the convulsed animals,
ChE activity is very low in both hemispheres.
Injections of atropine and scopolamine
eliminate the circling and convulsions.

2 The material to be reviewed here lies mainly in
the realm of experimental psychology. There is evidence
that a faulty cholinergic mechanism may also
be somehow involved in various clinical syndromes,
but an adequate treatment of these problem areas
is beyond the present scope of this writer. The interested
reader is referred to papers on the roles of
ACh in epilepsy (McIlwain, 1959), Parkinsonism
([...] & Marshall, 1962), psychosis (Rubin, 1962),
and reading disability (Smith & Carrigan, 1959).

Aprison, Nathan, and Himwich (1954,
1956) obtained similar results and also found
that animals whose ChE activity is particu- 
larly low on the injected side circle toward 
that side; they suggested that in this case 
ACh activity was high enough to be inhibi- 
tory, so the other hemisphere gained control 
of behavior. Injections of ACh itself produced 
the same effects. 

White (1956) found that persistent contra- 
versive turning results from injections of DFP 
directly into the caudate nucleus in doses 
which decrease ChE activity there to 20-40% 
of normal. Local administration of atropine 
prevented or stopped this circling, thus fur- 
ther implicating an underlying cholinergic 
mechanism. 

A large part of the behavioral research in 
the area of brain chemistry has been done 
by Krech, Rosenzweig, and Bennett, and their 
students (see the 1960 review by Rosen- 
zweig et al.). They ran rats in an unsolvable 
Krech Hypothesis Maze. The animals were 
said to have a “spatial hypothesis” if they 
went to the left or to the right regardless 
of lighting, and a “visual hypothesis” if they 
went to the dark door or the illuminated door 
regardless of side. It was found that the 
animals generally had visual hypotheses at 
first but that some shifted to spatial hypothe- 
ses. This shift was considered adaptive be- 
cause it indicated an ability to ignore the 
dominant (visual) stimulus if necessary, and 
to try a new technique when the old one 
is unsuccessful. The animals which adopted 
spatial hypotheses were thus considered 
“maze-bright” animals; the others, “maze-dull.”
Strains of maze-bright (S₁) and maze-dull (S₃)
rats have been bred for many years
(Tryon, 1940) and tested in a variety of
situations. It has been found (Rosenzweig
et al., 1960) that if the hypothesis maze is
made progressively solvable, the S₁s make
more rapid shifts in hypothesis when necessary
than do the S₃s. If the maze is made
problemless by removing all doors, the S₁s
show no tendency to shift to spatial hypotheses
(Peirce, 1959). The S₁s tend to perform
better in other situations also; they made
significantly fewer errors than the S₃s in a
Lashley III maze, a Hebb-Williams maze,
and a Dashiell maze (Rosenzweig et al.,
1960). The S₁s are also reported to have
learned a vertical-horizontal discrimination
problem significantly faster than the S₃s
(Fehmi & McGaugh, 1961). 

In a chemical analysis of the rats’ brains, it 
was found that the S₁ rats had significantly
higher ChE activity than the S₃s. It was at
first assumed that high-ChE animals would 
also be high in ACh, and that, within limits, 
the higher the level of these chemicals in the 
brain, the more efficient the neural transmis- 
sion and the more intelligent (adaptive) the 
animals. However, when rats were bred spe- 
cifically for ChE (Roderick, 1960), the re- 
sulting high-ChE animals did not choose 
spatial hypotheses as regularly as the S₁s.
Subsequent measurement of ACh indicated 
that ACh and ChE levels are genetically inde- 
pendent. Maze-bright animals do have sig- 
nificantly more ACh than maze-dulls, but rats 
bred specifically for ChE show varying levels 
of ACh. 

A clue as to the basic functional difference 
between the S₁ and S₃ animals was provided
by an experiment carried out by McGaugh, 
Jennings, and Thomson (1962). Noting that 
previous experiments had always employed 
massed trials, they decided to use a variety 
of intertrial intervals—30 seconds (massed), 
5 minutes, 30 minutes, and 1 day—to see if 
the strain differences held up under all
these conditions. The result was that the S₁
rats learned the maze faster than the S₃s
only when they were trained with a 30-second
intertrial interval; there were no significant
differences at other intervals. This
suggests that there is a difference in consolidation
rate between the two strains, so that
the increment in memory (habit strength)
which occurs within 30 seconds after a trial
is greater for the S₁s than for the S₃s.

A study by Thomson and his co-workers 
(Thomson, McGaugh, Smith, Hudspeth, & 
Westbrook, 1961) gives further support to 
the consolidation-time hypothesis. They 
trained S₁ and S₃ rats in a Lashley III maze,
with one trial per day. Electroconvulsive 
shock was given either 45 seconds, 1 minute 
15 seconds, 5 minutes, 30 minutes, or 2 hours 
after each trial. Control animals were given 
no ECS. In the S₁s, only the 45-second ECS
produced an appreciable decrease in learning
rate. In the S₃s, on the other hand, ECS
within 30 minutes after the trial lowered the
learning rate significantly. (It is rather sur- 
prising to think that a process which may 
effectively occur within 1 minute in some 
animals may take as long as 30 minutes in 
others.) 

In a series of experiments (Breen & McGaugh,
1961; McGaugh, 1959, 1961; McGaugh
& Thomson, 1962; McGaugh, Thomson,
Westbrook, & Hudspeth, 1962; McGaugh,
Westbrook, & Burt, 1961; McGaugh, Westbrook,
& Thomson, 1962), McGaugh and his
co-workers have found that rats injected with
CNS stimulants (strychnine, picrotoxin, and
1757 I.S.) learn Lashley III mazes more rap- 
idly than noninjected animals. It is signifi- 
cant that if these injections were made after 
the trials (within 1 minute), the facilitation 
still occurred; this suggests that the drugs 
affect the consolidation of the neural trace set 
up on the trial. Generally, the learning of the 
S₃s was facilitated more than that of the S₁s,
as would be expected in view of the interpretation
presented here. Similar work with specifically
cholinomimetic and cholinolytic drugs
and with specifically high-ChE and low-ChE
animals, as well as S₁s and S₃s, may be expected
to shed additional light on the mechanisms
involved.

On the basis of evidence that ChE activity
may increase in response to an increase
in neural activity, Krech, Rosenzweig, and
Bennett (1960a) hypothesized that animals
raised in a rich environment would have
higher ChE activity than normal laboratory
rats or rats raised in isolation. They raised
animals in three conditions: (a) environmental
complexity and training (ECT), raised
in very large group cages filled with “toys,”
and given daily training sessions in
mazes, (b) social control (SC), raised in
group cages without toys, and (c) isolation
control (IC). It was found that the ECT rats
had significantly greater subcortical ChE
activity than the ICs, with the SCs
in between. Surprisingly, the ECTs had lower
ChE activity per unit of cortical tissue; however,
it was subsequently found (Rosenzweig,
Krech, Bennett, & Diamond, 1962) that the
cortices of the ECTs weighed significantly
more than those of the ICs, so that the total
cortical ChE activity was greater in the
ECTs.
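
The step from lower activity per unit of tissue to greater total activity can be made explicit. The following is a minimal sketch in illustrative notation that does not appear in the source: a is ChE activity per unit weight of cortical tissue, W is cortical weight, and T is total cortical ChE activity.

```latex
% Illustrative notation (an assumption, not the authors'):
% a = ChE activity per unit weight of cortex, W = cortical weight, T = total activity.
\[
  T = a\,W, \qquad
  T_{\mathrm{ECT}} > T_{\mathrm{IC}}
  \iff
  \frac{W_{\mathrm{ECT}}}{W_{\mathrm{IC}}} > \frac{a_{\mathrm{IC}}}{a_{\mathrm{ECT}}} .
\]
```

On this reading, the ECT animals can show greater total cortical activity despite lower activity per unit of tissue, provided that their proportional gain in cortical weight exceeds their proportional deficit in activity per unit weight.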

A series of related experiments have been 
done. Rats raised in a rich environment (but 
without training) were found to perform sig- 
nificantly better on reversal discrimination 
problems than rats raised in isolation (Krech, 
Rosenzweig, & Bennett, 1962); this suggests 
a similarity between the ECs and S₁s. It was
found (Krech, Rosenzweig, & Bennett, 1960a) 
that neither amount of handling nor motor 
activity alone is the crucial component of 
the ECT condition. Zolman and Morimoto 
(1962) found that rats raised in isolation for 
8 weeks and then placed in an ECT condition 
for 4 weeks did not show the changes in ChE 
activity that the original ECT animals 
showed. They also reported that if animals 
raised in ECT were subsequently isolated, the 
chemical effects of ECT were dissipated. Ani- 
mals raised in isolation for 5 weeks and then 
in ECT for 7 weeks had ChE activity inter- 
mediate between that of animals raised for 
the whole period in either isolation or ECT 
(Rosenzweig, Krech, Bennett, & Zolman, 
1962). There is thus some suggestion of 
“critical periods” in brain chemistry. 

Thiessen, Zolman, and Rogers (1962) 
raised rats in five conditions: small solid- 
walled individual cages; large, solid-walled 
individual cages; small mesh cages within a 
large group cage; a large group cage; and a 
large group cage full of toys plus a daily hour 
of free play in an open field. After 4 weeks, 
animals in the fifth group (richest environ- 
ment) were found to have significantly higher 
subcortical ChE activity and adrenal weights 
than the others. Adrenal weight was posi- 
tively correlated with ChE activity in all 
groups. There were no appreciable differences 
in cortical ChE. It may be, then, that the
major effects of early environmental differences
are due not to changes in sensorimotor
circuits in the cortex but to changes in autonomic
and arousal mechanisms. Further studies
of these physiological and biochemical
changes will hopefully provide a key for explaining
the behavioral changes reported in
the vast and often conflicting literature on 
effects of early environmental differences. 

Russell (1958, 1960) maintained chronically
reduced ChE levels in rats by putting
“Systox,” an organophosphorus anticholinesterase,
in their food. He reported that behavior
passed through several stages as ChE activity 
was progressively reduced: When activity was 
60-100% of normal, no significant behavioral 
effects were observed. There was a suggestion 
of heightened behavioral efficiency when ChE 
activity was 40-60% of normal. Below 40%, 
there was a rapid loss of efficiency, followed 
finally by convulsions and death. Animals in 
the 20-40% range were reported to be, 


slower in eliminating responses which had previously 
been learned but were no longer adequate, less effi- 
cient in serial problem-solving, and less efficient in 
adjusting to stresses imposed by the environment. In 
other behavior patterns—including locomotion, sim- 
ple learning, instrumental conditioning, and visual 
discriminations—their performances did not differ 
significantly from those of control animals [Russell, 
1960, p. 28]. 


Specifically, it was found (Russell, Watson, & 
Frankenhaeuser, 1961) that rats in the 20- 
40% range learned a shock avoidance as 
readily as animals in the 40-100% range, but 
the former were much slower in extinguishing. 

Carlton (1963) postulated that a cholinergic 
system in the brain antagonizes a second sys- 
tem which activates behavior, and predicted 
therefore that increased activation and de- 
creased cholinergic activity should produce 
qualitatively similar effects. He found that 
amphetamine (which presumably stimulates 
the activating system) increased the response 
rate of rats in an operant shock-avoidance 
situation, and that atropine and scopolamine 
(anticholinergics) augmented the action of 
amphetamine (Carlton, 1961a). 

He further postulated that “the level of ac- 
tivation could be viewed as controlling the 
tendency for all responses to occur, whereas 
an inhibitory cholinergic system would act to 
antagonize this action on nonreinforced re- 
sponses [1963, p. 27].” In other words, when- 
ever an animal learns by eliminating the 
wrong responses, a cholinergic mechanism is 
responsible for the elimination. In experiments 
in which rats learned to barpress at fixed 
intervals (i.e., to inhibit their responses dur- 
ing certain intervals), anticholinergic drugs 
broke down the temporal patterning; the ani- 
mals began to respond more or less steadily 
(Boren & Navarro, 1959; Carlton, 1961b, 
1963; Herrnstein, 1958). In animals which
had learned to press one of several bars, 
anticholinergic drugs tended to increase re- 
sponding on the other bars (Carlton, 1963). 
Furthermore, habits which had been extin- 
guished reappeared when the animals were 
given anticholinergic drugs, and did not ex- 
tinguish again after thousands of unreinforced 
trials. Any time the drug injections were 
discontinued, the animals’ response rate fell 
to its preinjection, extinction level. This would 
suggest that extinction involves inhibiting a 
response rather than “erasing” it. 

Carlton’s activation mechanism may be 
nonspecific arousal as mediated by the reticu- 
lar activating system. Amphetamine is thought 
to stimulate this system (Longo, 1962), prob- 
ably at the level of the midbrain reticular 
formation. The cholinergic mechanism may
be primarily cortical, involving specific stimu- 
lus-response connections (circuits), with some 
units activating inhibitory cells, if necessary, 
to block competing responses (ACh itself is 
not an inhibitory transmitter). Reducing ACh 
would reduce the ability of such a circuit to 
be active or at least to be dominant. On the 
other hand, if the thalamus is in some way 
responsible for focusing attention or channel- 
ing activation into certain sensory or response 
systems, ACh may be the mediator at this 
level. Perhaps microinjection studies (with 
chronic implanted micropipettes) will help 
clarify the primary site of cholinergic action. 


SUMMARY AND CONCLUSIONS 


The evidence reviewed indicates that of all 
the proposed candidates, acetylcholine (ACh) 
best meets the essential criteria for a central 
synaptic transmitter: (a) that it be present in 
and synthesized by brain tissue, (b) that it 
be released during neural activity, (c) that it 
have an excitatory action on the postsynaptic 
membrane, and (d) that there be an enzyme 
present to catalyze its breakdown. (This en- 
zyme in the case of ACh is acetylcholinester- 
ase, or ChE.) Cholinergic transmission is es- 
pecially likely in the cortex, hypothalamus, 
diffuse thalamic projection system, and lateral 
geniculate. There are almost certainly other 
excitatory transmitters, but their identity is 
unknown. 


A theoretical framework in which chemical 
transmitters are important was discussed. It
involves reverberation and consolidation of 
neural circuits. The amount of transmitter 
substance available is thought to play an 
important part in determining the intensity 
and duration of the active trace. It is also 
postulated that the change which underlies 
memory is an increase in the availability of 
the transmitter at those cell endfeet which 
were active in the particular circuit involved. 
Behavioral studies in which ACh or ChE 
levels were manipulated genetically or with 
drugs were reviewed. Animals which are high 
in ACh and ChE learn more rapidly than 
animals low in ACh and ChE when trials are 
massed but not when they are distributed; 
this suggests that the main functional differ- 
ence is in rate of consolidation. Animals 
raised in rich environments differ significantly 
in brain ChE activity from animals raised in 
isolation. Moderate reductions in ChE activity 
may increase behavioral efficiency; large re- 
ductions decrease efficiency. Blocking ACh 
function disrupts the temporal patterning of 
learned responses (e.g., in fixed interval situ- 
ations) and may prevent extinction. 

A reanalysis, which partitions the proportions of variance from the various
main sources and their interactions, of previously reported data from the S-R 
Inventory of Anxiousness and from a new sample, suggests that the debate 
over the relative importance of individual differences and of situations is largely 
a pseudoissue. While it is true that the mean squares for situations are regularly 
much larger than the mean squares for individual differences, a partitioning of 
the variance shows, at least for the trait of anxiousness, that each of these 
main sources contributed only about 5% of the total variation (sum of the 
component variances), modes of response about 25%, and that nearly one-third of the
variance comes from simple interactions (Subjects with Situations about 10%, 
Subjects with Modes of Response about 11%, and Situations with Modes of 
Response about 7%). These proportions are highly stable across samples of Ss. 
The fact that such a substantial portion of the total variance comes from 
interactions confirms the suggestion that personality description might be 
improved by emphasizing what kinds of responses individuals make with what
intensity in various kinds of situations. 


Whether the major source of variance in 
behavior derives from the situation or the 
person is an important recurrent issue be- 
tween social psychologists and personologists. 
Social psychologists, especially those influ- 
enced by Cooley (1902) and George Herbert 
Mead (1934), have contended that the major 
source of behavioral variation resides in the 
situations in which individuals act (see Cot- 
trell, 1942a, 1942b; Dewey & Humber, 1951; 
Lindesmith & Straus, 1949) and derives from 
the definitions or meanings these situations 
have for individuals in terms of cultural rules 
and the roles they call forth. Personologists 
(Cattell, 1946, 1950; Cattell & Scheier, 1961; 
McClelland, 1951; Murray, 1938; but except- 
ing Allport, 1937, 1962) and clinicians (see 
Rapaport, Gill, & Schafer, 1945) have as- 
sumed that individual differences and their 
dynamic sources within individuals are es- 
sentially consistent across situations and 
thereby constitute the major source of behav- 
ioral variation. 


1 This study was supported in part by United 
States Public Health Service grants (MH-K-6-18,567 
and MH-08987) and in part by a grant from York 
University, Toronto. The authors wish to thank L. J. 
Cronbach, Goldine C. Gleser, D. W. Fiske, and John 
Gaito for their helpful comments and suggestions. 
The assistance of W. J. Jobst and Mrs. P. Offman 
with respect to some of the statistical analyses is 
also acknowledged. 


Analyzing this issue, Hunt (1959, 1963) 
has pointed out that those emphasizing indi- 
vidual differences have found that interjudge 
coefficients of reliability and validity coefficients
of tests assessing personality traits
typically fall between .20 and .50. Using the 
square of these coefficients as an estimate of 
the proportion of variance contributed by per- 
son differences suggests that these proportions 
are limited to between 4% and 25% of the 
total. A range of proportions so small is
hardly consonant with the notion of consist- 
ent across-situation differences among persons 
as the major source of behavioral variance. 
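
The arithmetic behind these bounds can be checked directly. The short sketch below squares the two limiting coefficients quoted in the text; the function name is an illustrative choice, and treating the squared coefficient as the proportion of variance is the estimate described above, not an additional claim.

```python
def variance_proportion(coefficient: float) -> float:
    """Estimate the proportion of variance as the square of a coefficient."""
    return coefficient ** 2


# The two limiting coefficients quoted in the text (.20 and .50).
for r in (0.20, 0.50):
    print(f"r = {r:.2f} -> {variance_proportion(r):.0%} of total variance")
```

The two printed proportions correspond to the 4% and 25% limits cited in the paragraph.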

What has been needed, however, is a direct 
comparison of the relative sizes of the contributions
to the total variance in behavior, or
reports of it, from persons (individual differences)
and from situations for various indicator
responses. Such comparison is needed
for several such traits as proneness to anxi- 
ety, to haste, to hostility, etc., since the sizes 
of the variance from the various sources for 
one trait may not hold precisely for another. 
Anxiety is a widely investigated trait in which 
persons are commonly presumed to vary consistently
across situations. Using the S-R Inventory
of Anxiousness, Endler, Hunt, and
Rosenstein (1962) have attempted such a
test for this trait of anxiousness in terms of
what their subjects reported they would do,
feel, or show in various situations of various
kinds. 

The format of the S-R Inventory of Anxiousness
had its origin in a logical analysis of
the meaning of trait ratings, an analysis on 
which Hunt has lectured to his students for 
years (see Hunt, 1959). This format samples, 
separately, modes of response, situations, and 
persons. The form of the S-R Inventory em- 
ployed by Endler, Hunt, and Rosenstein 
(1962) samples 14 reported modes of response 
in each of 11 situations with each person or 
subject. Each mode of response is reported 
for each of the situations; thus, the total 
number of items is 154. The form of the Inventory
also employs a five-step scale for
intensity of response ranging from “none” to 
“very much.” The subject is asked to report 
the intensity of his response (physiological re- 
action, feeling, direction of response, or effect 
on action in progress) for each situation. All
scales have the same direction, so that for 
each item, a high score (5) indicates a high 
level of anxiety and a low score (1) indicates 
a low level of anxiety. The format gives a 
page to each of the 11 situations, and the 14 
modes of response are listed below the situ- 
ation as follows: 


“You are just starting off on a long automobile trip.”


1. Heart beats faster              1   2   3   4   5
                                   Not at all        Much faster
2. Get an uneasy feeling           1   2   3   4   5
                                   None              Very strongly
3. Emotions disrupt action         1   2   3   4   5
                                   Not at all        Very disruptive
6. Perspire                        1   2   3   4   5
                                   Not at all        Perspire much


7...14. [See Endler, Hunt, & Rosenstein, 1962.]
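
As an illustration of this format, the sketch below lays out the 154 situation-by-mode items and checks one subject's ratings against the five-step scale. It is a minimal sketch: the names `items` and `total_score` are hypothetical, and summing ratings into a single total is only one of the scores that could be formed from the inventory.

```python
from itertools import product

N_SITUATIONS = 11      # situations sampled by the 1962 form
N_MODES = 14           # modes of response reported in each situation
SCALE = range(1, 6)    # five-step intensity scale: 1 = low, 5 = high

# One item per (situation, mode of response) pair.
items = list(product(range(N_SITUATIONS), range(N_MODES)))


def total_score(responses: dict) -> int:
    """Sum one subject's ratings after checking that every item has a
    rating and that each rating lies on the 1-5 scale."""
    if set(responses) != set(items):
        raise ValueError("every situation-by-mode item needs one rating")
    if any(rating not in SCALE for rating in responses.values()):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(responses.values())


lowest = total_score({item: 1 for item in items})
highest = total_score({item: 5 for item in items})
print(len(items), lowest, highest)  # 154 154 770
```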


Using a three-way analysis of variance of 
these reported intensities for the various 
modes of response, Endler, Hunt, and Rosen- 
stein (1962) found that for a sample of 67 
students at the University of Illinois selected 
for the extremes of anxiousness, the sampling 
of situations nevertheless yielded a mean 
square (158) which was 3.8 times the mean 
square from this sampling of subjects (40), 
and that for a random sample of 169 fresh- 
men at Pennsylvania State University, the 
ratio of the mean square from situations 
(244) to the mean square from subjects (21) 
was 11.49. Although they also noted that the
mean square for modes of response was con-
siderably greater than the mean square for
either situations or subjects and that the
mean square for the interaction between
modes of response and situations was nearly
as large as that for persons (individual dif-
ferences), they emphasized the comparison
between the variance from situations and the
variance from subjects (Endler et al., 1962,
pp. 9-14). The fact that the variance from situ-
ations was substantially greater than that


from individual differences clearly appeared 
to indicate that social psychologists have been 
more nearly correct than personologists. 

Like many disputes in the history of sci- 
ence, however, this one between social psy- 
chologists and personologists over whether 
the main source of variation in behavior is in 
situations or in persons turns out to be a 
pseudoissue, the discovery of which leads to 
new insight with import for behavioral pre- 
diction and for personal assessment. In the 
light of further analysis, the behavioral vari- 
ance in reports of anxiety-indicating responses 
turns out to be due primarily neither to situ-
ations nor to persons. Although a comparison
of mean squares for situations and for per- 
sons or subjects may have had a certain sur- 
prise value in the Endler, Hunt, and Rosen- 
stein (1962) communication of their results, 
actually, the mean square for the situational 
source is a composite of variance from situa- 
tions per se, from the interaction of situations 
with subjects, from the interaction of situa- 
tions with modes of response, from the triple 
interaction, and from error. The mean square
for subjects is a similar composite, as is also
the mean square for modes of response. Be-
cause the mean square for each main con-
tributor is a composite of a number of con-
tributors to the variance, the logic of such
comparing of mean squares is highly dubious.²
The mean squares describe the variance 
among mean scores (i.e., over individual per- 
sons and over modes of response, for example, 
in the case of the mean square for situations) 
or among total scores (i.e., over situations and 
over modes of responses in the case of the 
mean square for individual differences). Ac- 
tually, each component contributes something 
to the variance of each item score, and an 
analysis of the relative proportions of variance 
from each component should be based on an 
analysis of the total variation among the spe- 
cific individual-item scores. It follows that, in 
order to discuss the relative proportion of 
variance contributed by each main component, 
it is necessary to determine the relative mag- 
nitude of variance from each component. This 
is not to be derived directly from the mean 
squares but should rather come from specifica- 
tion equations. These specification equations 
are expectations of mean squares and are 
equated to sums of relevant components of 
variance, all properly weighted. We can then 
solve for each component source of variance. 

Various statistical innovators have sug- 
gested procedures for partitioning the com- 
ponents of variance (see Bolles & Messick, 
1958; Cornfield & Tukey, 1956; Endler, 
1966; Endler & Jobst, 1964; Gaito, 1960; 
Lindquist, 1953; Scheffé, 1959). 

This paper presents a reanalysis of the data
reported by Endler, Hunt, and Rosenstein
(1962). The reanalysis determines the rela-
tive contributions to the total variance from
subjects, situations, and modes of response,
and from all the various interactions. It con-
sists of comparing estimated components of
variance rather than either mean squares or
sums of squares. In order to add information
concerning the generality of the conclusions
across samples of subjects, moreover, a simi-
lar analysis of data from one new sample of
subjects is included.


2 Various investigators have compared mean
squares in terms of their relative magnitude; that
is, they state, for example, that in a particular
study the mean square for one source is greater
than that for another source. These relative
magnitudes do not, however, imply relative propor-
tions of variance, since the sum of all the mean
squares is greater than the total variance. This is
because some components (residual, for example)
may contribute to all the mean squares. For descrip-
tive purposes (and limiting ourselves to the data at
hand), a comparison of mean squares may be valid.
For purposes of estimating the proportions of total
variance from the various sources, it is invalid.


METHOD 
Subjects and Procedure 


The subjects in all three samples were given
the S-R Inventory of Anxiousness along with
other measures not pertinent to the present
report. The two samples of students described
by Endler, Hunt, and Rosenstein (1962) were
determined by considerations in other investi-
gations. The Illinois sample included, in equal
proportions, those who scored among the highest
15% and among the lowest 15% of students
on the Mandler-Sarason (1952) Test Anxiety
Questionnaire. These subjects were selected
by Rosenstein (1960) for another study. The
Penn State students were not preselected 
some of them were used in another sti 


the sample from Penn State presumably ap-
proximated a random sample of freshmen in a
large state-supported university. The third
sample consisted of 53 male and female stu-
dents in an elementary psychology course at
York University and was presumably fairly
representative of college students in a small
liberal arts college in Canada.


Statistical Manipulation 


The relative contributions of variance from
situations, subjects, modes of response, two-
way interactions, and residual (composed of
the triple interaction and error or within
variance combined because the triple inter-
action cannot be separated from error with
only one measure of each response for each
subject in each situation) were assessed by
means of a three-way analysis of variance as-
suming a random-effects model.³ The pro-
cedure used to partition variance from these
various sources for the three samples is the
one reported by Gleser, Cronbach, and
Rajaratnam (1965), by Endler and Jobst
(1964), and by Endler (1966).


3 Some of our colleagues have questioned the use
of a random-effects model rather than a mixed-
effects model. The latter may appear to be more
appropriate since the situations and the modes of
response in the S-R Inventory of Anxiousness are
not random samples of all possible situations or of
all possible modes of response. The S-R format for
inventories, however, has one of its chief advan-
tages in separate, and potentially random, sampling
of situations and of modes of response, as well as of
subjects. It may be noted that in the monograph
(Endler, Hunt, & Rosenstein, 1962) no claim was
made for statistical generality. Generality was to
derive from the empirical reproducibility of the
findings not only across samplings of situations, of
modes of response, and of subjects, but also across
traits. With a random-effects model, we could
determine all the components of variance if each
subject were asked to report each mode of response
more than once in each situation. With the mixed-
effects model, the only way we could solve for all
the components would be by assuming the triple
interaction to be zero. Since the triple interaction
should be psychologically meaningful, we believe it
is less hazardous to assume a random-effects model
than a mixed-effects model. At any rate, the com-
ponents of variance are comparable for random- and
mixed-effects models (see Endler, 1966, for a more
detailed discussion of the choice of model when solv-
ing for components of variance).


For a three-way analysis of variance, with
situations (Sit), subjects (S), and modes of
response (M-R) as sources and a random-effects
model, the mean squares have the following
expectations, in terms of the various com-
ponent sources of variance (note that p =
persons or subjects; i = situations; j = modes
of response; r = residual and is equal to the
triple interaction plus error or within vari-
ance; and n, k, and m = the numbers of sub-
jects, situations, and modes of response, re-
spectively):


E(MS_p) = \sigma_r^2 + km\,\sigma_p^2 + m\,\sigma_{pi}^2 + k\,\sigma_{pj}^2   [1]
E(MS_i) = \sigma_r^2 + nm\,\sigma_i^2 + m\,\sigma_{pi}^2 + n\,\sigma_{ij}^2   [2]
E(MS_j) = \sigma_r^2 + nk\,\sigma_j^2 + k\,\sigma_{pj}^2 + n\,\sigma_{ij}^2   [3]
E(MS_{pi}) = \sigma_r^2 + m\,\sigma_{pi}^2   [4]
E(MS_{pj}) = \sigma_r^2 + k\,\sigma_{pj}^2   [5]
E(MS_{ij}) = \sigma_r^2 + n\,\sigma_{ij}^2   [6]
E(MS_r) = \sigma_r^2   [7]
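
As a hedged illustration, expectations [1] to [7] can be solved for the seven components by substituting the observed mean squares for their expectations. The sketch below does this for the Penn State mean squares as they can be read from Table 1. The function names `variance_components` and `percentages` are hypothetical, and the code is a minimal sketch of the general approach, not a reproduction of the authors' computing procedure. It makes no adjustment for negative estimates or for pooling nonsignificant sources.

```python
def variance_components(ms, n, k, m):
    """Solve expectations [1]-[7] for the seven components of variance.

    ms maps source labels ('p', 'i', 'j', 'pi', 'pj', 'ij', 'r') to observed
    mean squares; n, k, m are the numbers of subjects, situations, and
    modes of response.
    """
    comp = {"r": ms["r"]}                                             # [7]
    comp["pi"] = (ms["pi"] - ms["r"]) / m                             # [4]
    comp["pj"] = (ms["pj"] - ms["r"]) / k                             # [5]
    comp["ij"] = (ms["ij"] - ms["r"]) / n                             # [6]
    comp["p"] = (ms["p"] - ms["pi"] - ms["pj"] + ms["r"]) / (k * m)   # [1]
    comp["i"] = (ms["i"] - ms["pi"] - ms["ij"] + ms["r"]) / (n * m)   # [2]
    comp["j"] = (ms["j"] - ms["pj"] - ms["ij"] + ms["r"]) / (n * k)   # [3]
    return comp


def percentages(comp):
    """Express each component as a percentage of the sum of components."""
    total = sum(comp.values())
    return {source: 100 * value / total for source, value in comp.items()}


# Penn State mean squares as read from Table 1 (N = 169, 11 situations, 14 modes).
penn_state_ms = {"p": 21.26, "i": 244.37, "j": 836.51,
                 "pi": 3.16, "pj": 2.86, "ij": 20.62, "r": 0.66}
penn_state_pct = percentages(variance_components(penn_state_ms, n=169, k=11, m=14))

for source, pct in penn_state_pct.items():
    print(f"{source:>2}: {pct:5.2f}%")
```

With these inputs the share for subjects comes out near 5.8% and the residual near 37%, in line with the proportions reported for the Penn State sample in the Results.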


RESULTS

Table 1 presents degrees of freedom (df)
and mean squares (MS) from the three-way
analyses of variance for the Illinois, Penn
State, and York samples, respectively. The
Illinois sample is discussed separately because
composing the sample of the top 15% and the
bottom 15% of students preselected on another
measure of anxiety tends to exaggerate the
variance from individual differences among
subjects.


TABLE 1
ANALYSIS OF VARIANCE (RANDOM-EFFECTS MODEL) OF REPORTED RESPONSES TO SITUATIONS IN THE
S-R INVENTORY OF ANXIOUSNESS FROM ILLINOIS, PENN STATE, AND YORK SAMPLES

                              Illinois a            Penn State b          York
                              (N = 67)              (N = 169)             (N = 53)
Source                        df        MS          df        MS          df      MS
Subject (S) (p)               [illeg.]  [illeg.]    [illeg.]   21.26      52      [illeg.]
Situation (Sit) (i)           [illeg.]  [illeg.]    [illeg.]  244.37      10      90.66
Mode of Response (M-R) (j)    [illeg.]  [illeg.]    [illeg.]  836.51      13      257.12
S X Sit (pi)                  [illeg.]  [illeg.]    [illeg.]    3.16      520     3.18
S X M-R (pj)                  [illeg.]  [illeg.]    [illeg.]    2.86      676     [illeg.]
Sit X M-R (ij)                [illeg.]  [illeg.]    [illeg.]   20.62      130     [illeg.]
Residual (r)                  [illeg.]  [illeg.]    [illeg.]    0.66      6760    0.64

a From Endler, Hunt, and Rosenstein, 1962, Table 3, p. 10. This sample was composed of
students scoring in either the highest or the lowest 15% on the TAQ (Mandler & Sarason, 1952).
b From Endler, Hunt, and Rosenstein, 1962, Table 4, p. 10.


In spite of the exaggeration of the variance
from subjects, the Illinois columns of Table
2 show the estimated proportion of the total
variation from subjects to be only 10.42%.
This exaggerated estimate of the proportion
from subjects in this reanalysis is slightly
larger than that from situations (i.e., 7.29%).
Both, however, are relatively small propor-
tions of the total variation (sum of the com-
ponents of variance). The proportion deriv-
ing from modes of response for this Illinois
sample is estimated at 19.53%, but, as pointed
out by Endler et al., the large size of this
proportion is a trivial finding inasmuch as
one would expect subjects to “get an uneasy
feeling” often, even though they might very
seldom experience “having loose bowels.”
More important theoretically is the fact that
the three simple interactions combined con-
tribute an estimate of about one-third of the
total variation. The interaction of subjects
with modes of response (S X M-R = 17.08%)
is probably exaggerated over the interaction
of subjects with situations (S X Sit = 9.89%)
and over the interaction of situations with
modes of response (Sit X M-R = 6.41%) by
the manner in which the Illinois sample was
selected. In this sample, the estimated resid-
ual accounts for about 30% of the sum of
variance components.


TABLE 2
ESTIMATED VARIANCE COMPONENTS AND PERCENTAGES FOR EACH COMPONENT DERIVED FROM A THREE-WAY
ANALYSIS OF VARIANCE (RANDOM-EFFECTS MODEL) OF REPORTED RESPONSES TO SITUATIONS
IN THE S-R INVENTORY OF ANXIOUSNESS FROM ILLINOIS, PENN STATE, AND YORK SAMPLES

                                    Illinois       Penn State     York
Source                              (N = 67)       (N = 169)      (N = 53)
Subject (S) (p)
Situation (Sit) (i)
Mode of Response (M-R) (j)
S X Sit (pi)
S X M-R (pj)
Sit X M-R (ij)
Residual (r)
Total variation (Components sum)

The last four columns of Table 2 present
the reanalysis of the Penn State sample and
a similar analysis for the new York sample.
While one of these samples is representative
of college students in a large state university
and the other of students in a new, small
Canadian college, it will be noted that the
proportions of total variation from the main
sources and from the various interactions are
very similar indeed, and that the degree to
which the various proportions differ from
those for the highly selected Illinois sample
in Table 2 is surprisingly small.

For the Penn State and York samples, the
estimated proportions of the total variation
are very similar, and those from subjects
(5.75% and 6.88%, respectively) are about
equal. These proportions from subjects are
somewhat smaller, as would be expected,
than that for the Illinois sample with the 70%
of moderately anxious students removed. In
the Penn State and York samples, the esti-
mated proportions of total variation from the
modes of response are increased to about one-
fourth, and this increase over that for the
Illinois sample would also be expected from
the nature of the differences in the sampling
of subjects. Again the simple interactions
combined contribute nearly a third of the to-
tal variation in each sample. The interaction
of Subjects with Situations accounts for an
estimate of about 10%, that for Subjects with
Modes of Response about 11%, and that for
Situations with Modes of Response about 7%;
and these proportions apply to both samples.
Again the residual contributes just over a
third of the total variation in each sample.


Discussion 


This reanalysis changes the answer to the
question of whether behavioral variation is
primarily a function of individual differences
or of situations. It would appear that behav-
ioral variation is attributable to neither of
these factors per se. The fact that reliability
and validity coefficients are low for the inven-
tories purporting to measure various person-
ality traits (behavioral characteristics pre-
sumed to be consistent across situations)
would be expected from the fact that consid-
erably less than a tenth of the variance
derives from individual differences. While sit-
uations [...]

1934), when the individual is free to respond 
according to his own inclinations, as in the 
case of answering the S-R Inventory, the situ- 
ation per se makes no more contribution to 
the total variation than do individual differ- 
ences. Moreover, the interaction of subjects 
with situations contributes more of the vari-
ance than does either by itself. While the
modes of response continue to contribute 
nearly a quarter of the variance, they too 
show substantial interactions with both sub- 
jects and situations, and the triple interaction 
itself may be a substantial contributor (one 
may guess it to be of the order of 10%) even 
though it could not be assessed by the ap- 
proach used here because no subject responded 
to any item more than once. 

The fact that the proportions of total 
variability coming from the 
are found to be similar in 


a genuine generalizability of actual response 
across subjects than it reflects the generaliza- 
tion of reports of response. A culture may 
more readily standardize what people are will- 
ing and able to report than it standardizes 
actual variations in heart rate, in palmar 
sweat, and in tendencies toward nausea or 
loose bowels. Only by measuring actual re- 
sponses of individuals in such real situations 
as those here presented only verbally can one 
estimate how well reports reflect behavior. 
The proportion of the total variability from 
the sampling of situations in this study may 
well be considerably smaller than that to be 
found in life in general. This sampling is 
loaded with situations which were chosen to 
evoke substantial amounts of anxiety. Had 
the sampling included also such situations as 
“sitting down to eat a holiday dinner,” “sit-
ting down to read the evening paper,” “get-
ting undressed for bed,” etc., the proportion
of variance from situations might be higher. 
Investigations now in progress will attempt to 
answer this question by administering and 
analyzing reports from inventories with other 
samplings of situations. 

It is possible that the large proportion of 
variance coming from modes of response may 
be exaggerated here. Some of the responses 
called for are socially desirable and others 
are socially undesirable; this factor may 
have exaggerated the variance of reports 
more than actual behavior would vary. Again, 
it would be necessary to measure these various 
modes of response from samplings of subjects 
in real situations before such a question could 
be answered. 

The finding that nearly a third of the 
total variation comes from the simple inter- 
actions is important. The interaction of Sub- 
jects with Situations (about 10% in all three 
samples) indicates that while behavior is 
shaped by the situation, the shape it takes is 
not independent of the individual. Individuals 
respond more or less to various situations, 
independently of the mode of response called 
for. 

The interaction of Subjects with Modes of 
Response (about 11% of the variance) would 
imply that individuals vary substantially in 
the patterns of autonomic response to which 
they are prone, as Lacey and Lacey (1958) 
have noted and emphasized. 

The interaction of Situations with Modes 
of Response (of the order of 7% of the vari- 
ance) indicates that at least some of the 
situations must tend to induce certain modes 
of response somewhat consistently in people. 
The fact that this interaction contributes so 
little of the variance reminds one of the in- 
adequacy of that earliest hope for simple, 
consistent S-R laws. On the other hand, the 
fact that such a source of variance does exist 
is consonant with the finding that certain 
kinds of stimuli do evoke specific kinds of 
autonomic responses fairly consistently (see 
Darrow, 1929; Davis & Buchwald, 1957; Da-
vis, Buchwald, & Frankmann, 1955; Lacey &
Lacey, 1958). 

The triple interaction is probably meaning-
ful psychologically even though it was not pos-
sible to isolate it in this study. The triple
interaction states that in a particular situa-
tion, a particular person has a particular mode
of response. For example, a given individual
may not need to urinate frequently, nor need
he be generally anxious about automobiles or
about riding in them, but when he is about
to start off on a long automobile trip, he may
find himself needing to urinate frequently.
Such triple interactions may be quite real,
and they may differ in important ways from
main effects, from simple interactions, and
from error. Unfortunately, when each person
reports each of his responses in each situation
only once, it is not possible to differentiate
estimates of the triple interaction from esti-
mates of the error. It is not easy to obtain
repeated realistic reports of specific responses
in each of the situations because subjects
may become irritated by the repetition or they
may merely repeat their first report of re-
sponse from memory. On the other hand, it
should be quite feasible to record actual re-
sponses of individuals in several encounters
with each of the various situations. Under
such investigative circumstances, the triple
interaction could readily be separated from
the error or within-variance.


Choice of Mathematical Model 


[...] included in each mean square, to estimate the
magnitude of each source from each sample of
data, and to compare the magnitude of each
source with the sum of all sources of variance
in the analysis. An alternative approach to
the partitioning of variance consists of com-
puting “coefficients of utility” by forming
ratios of the sum of squares for each source
of variance to the total sum of squares in
the analysis (Bolles & Messick, 1958). Al-
though coefficients of utility are simple to
compute and are often empirically similar to
estimated components of variance, such an
approach is not advocated because it suffers
from at least two defects.

First, the ratios of the sum of squares for 
the various sources to the total sum of square 
are the same irrespective of whether a fixed 
effects, a mixed-effects, or a random-effects 
model is assumed. Thus, comparing ratios of
the various sums of squares disregards what
has been learned about statistical models dur- 
ing the past 25 years, while the approach by 
way of components of variance is sensitive to 
the various models. Empirical discrepancies 
between the two approaches may be small, 
but the results obtained with the two models 
do differ. Gaito (1958) presented a set of 
data where the results and conclusions from 
the two models differed radically. 

Second, comparing the ratios of the various 
sums of squares with the total sum eliminates 
the possibility of removing the effects of in- 
significant sources of variance. In the ap- 
proach by way of components of variance, one 
can pool the mean squares for nonsignificant 
sources. Thus, with the approach by way of 
comparing ratios for the sums of squares from 
the various sources, one throws away infor- 
mation by failing to make use of the tests of 
the null hypothesis. This can make a substan-
tial difference in results. On these considera-
tions, Gaito (1958) suggested in his critique
of the Bolles-Messick coefficient of utility 
that for both ease of interpretation and de- 
finitiveness of results, the approach by way 
of components of variance is to be preferred 
to that by way of sums of squares. 
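
To make the contrast concrete, the sketch below computes coefficients of utility for the Penn State sample from the mean squares as read from Table 1. It is a minimal sketch under stated assumptions: each sum of squares is recovered as the mean square times its degrees of freedom, the degrees of freedom are those of a fully crossed design with one score per cell, and the function names are hypothetical.

```python
def degrees_of_freedom(n, k, m):
    """Degrees of freedom for a fully crossed n x k x m design with one
    score per cell (subjects x situations x modes of response)."""
    return {
        "p": n - 1, "i": k - 1, "j": m - 1,
        "pi": (n - 1) * (k - 1),
        "pj": (n - 1) * (m - 1),
        "ij": (k - 1) * (m - 1),
        "r": (n - 1) * (k - 1) * (m - 1),
    }


def coefficients_of_utility(ms, n, k, m):
    """Ratio of each source's sum of squares (MS x df) to the total sum of squares."""
    df = degrees_of_freedom(n, k, m)
    ss = {source: ms[source] * df[source] for source in ms}
    total = sum(ss.values())
    return {source: value / total for source, value in ss.items()}


# Penn State mean squares as read from Table 1 (N = 169, 11 situations, 14 modes).
penn_state_ms = {"p": 21.26, "i": 244.37, "j": 836.51,
                 "pi": 3.16, "pj": 2.86, "ij": 20.62, "r": 0.66}
utility = coefficients_of_utility(penn_state_ms, n=169, k=11, m=14)

for source, ratio in utility.items():
    print(f"{source:>2}: {100 * ratio:5.2f}%")
```

The ratios depend only on the sums of squares, so they come out the same whichever model is assumed, which is the first defect noted above.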


Tucker’s Three-Mode Factor Analysis 


A highly appropriate alternative treatment 
of such data would be Tucker’s (1964) three- 
mode factor analysis if the existing storage 
limitations of computers did not make com- 
putation so difficult. The highly significant 
interactions in the analysis of variance indi- 
cate that situations and modes of response are 
neither single factored nor independent of one 
another. In their simple factor analyses of 
the situations and of the modes of response, 
Endler, Hunt, and Rosenstein (1962) found 
three types of situations and three types of 
modes of response. Levin’s (1965) three-mode
factor analysis of the data from the Penn
State sample included here confirms these
findings, at least in general, and indicates sig-
nificant interactions. His core matrix consists
of three idealized subjects or types. Each type
is an interaction matrix of Situation Factors
X Response Factors; that is, for each type,
we obtain the scores of response factors to
the situation factors (see Levin, 1965, p. 451).




Since Levin’s results from a three-mode fac- 
tor analysis are highly consistent with our 
own, they provide a confirmation of the 
method of estimating the size of components 
of variance used in this study. 


Suggestions for Personality Assessment 


The fact that very substantial portions of 
the total variance come from the interactions 
of subjects with situations, of subjects with 
modes of response, and of situations with 
modes of response, and from the triple inter- 
action, has importance for personality descrip- 
tion and for the design of inventories to pre- 
dict either behavior or feelings. First, it im- 
plies that accuracy of personality description 
in general calls for statements about the 
modes of response that individuals manifest 
in various kinds of situations as well as state- 
ments about their general proneness to make 
certain responses rather than others, and about 
their proneness to be responsive rather than 
unresponsive. Such a direction in personality 
description would have its ultimate limit in 
the uniqueness of the individual emphasized 
by G. W. Allport (1937, 1962). Like many 
limits, uniqueness in generalized description 
of persons may be unfeasible, but considera- 
ble refinement might be achieved with ap- 
preciable profit by categorizing situations and 
by categorizing modes of response and, then, 
by describing individuals in terms of those 
categories of responses they are prone to 
make in various categories of situations. The 
format of the S-R Inventory permits sepa- 
ration of the situations to be symbolically 
encountered by the subjects from the re- 
sponses they are asked to report. It also per- 
mits sampling of situations and of responses 
separately. Considerable improvement in the 
validity of trait assessment might well be ob- 
tained from the use of this format with the 
samples of situations and modes of response 
designed for the purpose of assessment. 

Second, the fact that substantial portions 
of variance come from the interactions sug- 
gests that the validity of predictions of per- 
sonal behavior should be substantially im- 
proved by asking the individuals concerned 
to report the trait-indicating responses of in- 
terest in the specific situations, or at least in 
the specific kinds of situations, concerned.
Evidence that use of the format of the S-R
Inventory in such a fashion does improve
both the precision of description and the va-
lidity of prediction, at least so far as anxiety
is concerned, has recently come from several
sources.

On the side of precision of description, End- 
ler and Bain (1967) have employed the origi- 
nal form of the S-R Inventory of Anxiousness 
to study social class differences and have 
found college students from the lower class 
more anxious than those from the middle 
class in situations loaded heavily on the inter- 
personal factor but not in situations offering 
inanimate, physical danger or in psychological 
situations. Concretely, the class differences in
reported responses were limited to such situ- 
ations as “You are going into an interview 
for an important job” or “You are entering a 
competitive contest before spectators”; they 
did not appear for such situations as “You 
are crawling along a ledge high on a mountain 
side” or “You are going as a subject into a 
psychological experiment.” Moreover, in a 
comparison of high school boys from lower 
and higher socioeconomic backgrounds on the 
various categories of responses in the S-R 
Inventory, Haywood and Dobbs (1964) found 
those of lower socioeconomic status scoring 
higher than those of higher socioeconomic 
status on those responses loading heavily on 
either the autonomic factor or the exhilaration 
factor. These studies suggest that the separa- 
tion of categories of situations from categories 
of responses afforded by the S-R Inventory 
permits an improvement in the precision with 
which class differences in anxiousness can be 
described. 

On the side of validity of prediction, two 
studies have yielded interesting evidence. In 
a study of the effects of three therapy-like 
approaches to the modification of frequency 
of recitation during quiz sections in a course 
in elementary psychology, D’Zurilla (1964) 
has related reported degrees of response in 
the various kinds of situations in the S-R In- 
ventory to the number of times each student 
actually talked out in class before the treat- 
ments were instituted. The anxiety scores for 
the situation described as “You are getting up 
to give a speech before a large group” showed 
a correlation of -.63 with the numbers of
pretreatment recitations. Significant
coefficients were found between anxiety scores
for other interpersonal situations and this
criterion, but the correlations became
insignificant for anxiety scores for other kinds
of situations and for global measures of anxi-
ety. In a study comparing the efficacy of in-
sight therapy and desensitization in reducing
anxiety in public speaking situations, Paul
(1966) found anxiety scores based on the
anxiety-indicating modes of response in the
S-R Inventory to the public speaking situa-
tion to show validity coefficients of the order
of .7 to .8 with subjects’ reports of anxiety
when actually speaking. Such anxiety scores
also showed significant but lower correlations
with rater observations of anxiety from sub-
jects’ speaking behavior. Anxiety scores based
on subjects’ responses to three other inter-
personal situations in the Inventory showed
significantly lower correlations with reported
and observed anxiety of subjects in the speak-
ing situation. These results suggest
that validity of prediction is a function of the
similarity of the test situations to which sub-
jects are asked to report their responses to
the criterion situation. The format of the S-R
Inventory permits specifying test situations
that are much like any criterion situation in
which one may wish to predict subjects’ re-
sponses.


CONCLUSION 


In the light of this reanalysis of the data
from the study by Endler, Hunt, and Rosen-
stein (1962) and of the analysis from the
additional sample from York University, the
question of whether individual differences or
situations are the major source of behavioral
variance, like many issues in the history of
science, turns out to be a pseudoissue. In
effect, there is no single major source of be-
havioral variance, at least so far as the trait
of anxiousness is concerned. Human behavior
is complex. In order to describe it, one must
take into account not only the main sources
of variance (subjects, situations, and modes
of response) but also the various simple in-
teractions (Subjects with Situations, Subjects
with Modes of Response, and Situations with
Modes of Response) and, where feasible, the
triple interaction (Subject with Situation
with Modes of Response). Behavior is a
function of all of these factors in combina-
tion.

The marked similarity for the proportions 
of total variation (sum of variance compon- 
ents from the various main sources and inter- 
actions) across the three samples of subjects 
in this study indicates considerable consist- 
ency, at least among college-age youth, for 
this sampling of situations and this sampling 
of modes of response. Work now in progress
is attempting to ascertain whether this gen- 
erality holds for other samples of situations 
and other samples of modes of response. 

The fact that the interactions contribute 
approximately a third of the variance implies 
that personality description can be improved 
by describing people in terms of the kinds of 
response they manifest in various kinds of 
situations. Studies are cited in which this 
implication is empirically confirmed.


[...] peripheral and central factors responsible for
4 electrical properties of palmar skin: (a) skin-resistance level (SRL), (b)
skin-resistance responses (SRRs), (c) skin-potential level (SPL), and (d) skin-
potential responses (SPRs), these latter being often diphasic, with an initial negative
change in potential followed by a positive wave. There seems little doubt that
SRL and SRRs are closely linked with sweat-gland activity, but, in [...]
there is probably some contribution from [...]. Available [...]
suggest that SPL is largely independent of [...]
to certain membrane characteristics of the epidermis. In the case of SPR, the
latency of the negative wave seems to correlate [...]
SRR, and both are probably functions of the presecretory activity of sweat
glands. The mechanism of the positive wave is in doubt; it is regarded by some
as a secondary aspect of sweat-gland activity and by others as being of inde-
pendent epidermal origin.


There is a great deal of indirect evidence 
relating skin-resistance level (SRL) and skin- 
resistance responses (SRRs) to sweat-gland 
activity. Thus the distribution of sweat glands 
over the body surface coincides with the 
distribution of SRL and the frequency of 
SRRs (Kuno, 1934; Leva, 1913), the palms 
and soles having the densest concentration, 
lowest resistance, and most readily elicited 
SRRs. Peripheral nerve section and sympa- 
thetic ganglionectomy are followed by high 
SRL and an absence of SRRs (Richter, 1927; 
Richter & Woodruff, 1941), and subjects with 
congenital absence of sweat glands show no
SRRs (Wagner, 1952). 

There are fewer data on the role of sweat- 
gland activity either in skin-potential level 
(SPL) or skin-potential responses (SPRs), 
the latter being frequently biphasic in form: 
an initial negative component followed by a 
positive wave. The available evidence has
been very largely derived from correlations
between potential and resistance measures;
this suggests a close correspondence between
potential and resistance responses but a rather
slight (though sometimes statistically signifi-
cant) relationship between potential and re-
sistance levels. To argue that because sweat-
gland activity is related to resistance meas-
ures, and


& Downman, 1964). Any apparent effect may
well be accounted for in terms of changes in


PERIPHERAL FACTORS
Role of Sweat Glands 


Anatomically, the eccrine sweat glands con- 
sist of a coiled secretory portion and a simple 
tubular duct. Although innervated by sympa- 
thetic fibres, the transmitting agent is ace- 
tylcholine. Some of the most convincing evi- 
dence for the sweat-gland hypothesis has been 
provided by pharmacological studies. Agents 
such as atropine prevent acetylcholine from 
exciting the sweat glands, and when introduced 
by iontophoresis the result is an increased skin 
resistance and elimination of SRRs (Lader & 
Montagu, 1962; Martin & Venables, 1964; 
Wilcott, 1964). In addition, both negative 
and positive waves of the skin-potential re- 
sponse are eliminated (Martin & Venables, 
1964). The amount of current required to 
eliminate sweat-gland activity by this means 
can be shown to vary from subject to subject 
and it was undoubtedly incomplete atropini- 
zation which gave rise to earlier discrepant 
findings in this area. 

In spite of the close relationship between 
palmar skin resistance and sweat-gland ac- 
tivity, Darrow suggested in 1927 that skin- 
resistance responses do not depend primarily 
upon sweat secretion since they regularly 
occur about 1 second prior to its visible 
appearance. Using more sophisticated and 
reliable apparatus for determining sweating, 
Wilcott (1962) obtained a similar figure. Evi- 
dence derived from infrahuman subjects also 
confirms this finding with respect to the nega- 
tive wave of the skin-potential response 
(Lloyd, 1959b; Patton, 1948; Shaver, Brusi- 
low, & Cooke, 1962). Thus, a number of stud- 
ies all agree that the electrical response re- 
liably precedes the appearance of sweat at the 
skin surface. 

Attempts to establish more precise corre- 
lates between the electrical activity of the 
skin and measures of sweat-gland activity 
will next be considered separately for skin- 
resistance and skin-potential measures. 


Sweat-Gland Activity and Skin Resistance 


Sweat-gland counts. A number of tech- 
niques have been employed, but all are re- 
stricted to spot counts, that is, they are not 
continuous. Correlations are therefore usu-
ally obtained with skin-resistance levels, and
not responses. 

In discussing the results of these studies, it 
seems important to observe the distinction be- 
tween within-subject and between-subject 
correlations, since there is a tendency for the 
magnitudes of such correlations to differ. In 
general, the within-subject correlations are 
higher: Thomas and Korr (1957), using the 
Netsky prism technique, reported correlations 
ranging from .440 to .960 for conductance 
level and gland count, and Martin and Vena- 
bles (1964), using a plastic ink technique 
(Sutarman & Thomson, 1952), reported a 
correlation of .40, p < .05. Between-subject
correlations range from .31 (Wenger & Gil-
christ, 1948) to .28-.47 for Caucasians and
.65-.79 for Negroes (Johnson & Landon,
1965).
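
The distinction between the two kinds of correlation can be made concrete with a minimal sketch. Everything in it is an assumption introduced for illustration: the function names, the record layout, and the readings themselves (arbitrary units, invented, not taken from any study cited here). A within-subject coefficient is computed over one subject's repeated readings; a between-subject coefficient is computed across subjects on each subject's mean values.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def mean(values):
    return sum(values) / len(values)


def within_subject_r(records):
    """One coefficient per subject, over that subject's repeated readings."""
    return {s: pearson(d["conductance"], d["count"]) for s, d in records.items()}


def between_subject_r(records):
    """One coefficient across subjects, computed on per-subject means."""
    xs = [mean(d["conductance"]) for d in records.values()]
    ys = [mean(d["count"]) for d in records.values()]
    return pearson(xs, ys)


# Hypothetical readings (arbitrary units), invented purely for illustration.
records = {
    "S1": {"conductance": [2.0, 3.0, 4.0, 5.0], "count": [10, 14, 19, 25]},
    "S2": {"conductance": [6.0, 7.0, 8.0, 9.0], "count": [20, 23, 27, 30]},
    "S3": {"conductance": [3.0, 4.0, 5.0, 6.0], "count": [30, 36, 41, 47]},
}

within = within_subject_r(records)
between = between_subject_r(records)
print(within, between)
```

In these invented data each subject's conductance tracks his own gland count closely, while subjects differ in how many active glands accompany a given conductance, so the coefficient across subject means is much lower than any of the within-subject coefficients.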

Continuous measurement of sweating. Ob- 
viously, the precise examination of relation- 
ships is better achieved with continuous 
methods of recording sweat activity, and a 
number of accurate techniques are available 
for the measurement of evaporative water 
loss (Adams, Funkhouser, & Kendall, 1963). 
This measurement includes water loss from 
both sweat glands and epidermis. Again, in 
evaluating studies which used these tech- 
niques, it seems useful to distinguish the 
within-subject from between-subject data. 
Adams and Vaughan (1965) gave within-
subject correlations for conductance and
sweating which ranged from .79 to .92. Wil-
cott’s between-subject result for resistance and
sweating, using similar but not identical
techniques, was -.22 (ns). These within-
and between-subject findings closely support
those given above for sweat-gland counts.

There is, however, a large discrepancy in
the reported within-subject correlations be-
tween SRR amplitude and sweating. Although
Adams and Vaughan (1965) and Wilcott
(1962) agree on a very close relationship (of
the order .8 and .9), Edelberg’s (1964) data
showed that only 4 out of 12 subjects gave a
significant relationship; the remaining 8
correlations were not statistically significant.
The sources of the discrepancy are difficult to
identify. Edelberg discussed certain differ-
ences in technique, but tentatively postulated
that an epidermal factor contributes to SR
measures independently of sweat-gland activ-
ity.

Before considering non-sweat-gland hy- 
potheses, however, there is evidence concern- 
ing other aspects of sweat-gland activity 
which could affect SR measures. 

It is possible, for example, that the chemi- 
cal constitution of surface sweat may be a 
contributory factor. Johnson and Landon 
(1965) suggested that the higher correlation 
between skin-conductance level and sweat- 
gland counts found in Negroes compared with 
Caucasians might be attributable to a higher 
electrolyte concentration in the sweat of Ne- 
groes. Further evidence along these lines is 
needed. Adams and Vaughan (1965) also
demonstrated that the obtained correlations 
(within-subject) for sweating and skin re- 
sistance are very dependent upon the phase 
of sudomotor activity, that is, whether it is 
increasing or decreasing. They suggested that 
a sustained low electrical resistance following 
substantial sweating may be due to water 
remaining in the stratum corneum. 

Relatively little is known on the mecha- 
nisms of reabsorption and how these might 
affect measures of electrical activity. Roth- 
man (1954) suggested that sweat may be 
absorbed by the horny layer, and Lloyd 
(1959a, 1959b, 1960) has argued that reab- 
sorption can occur in the sweat duct. From 
Lloyd’s figures, it would appear that the rate 
of reabsorption is too slow to provide an ex- 
planation of the recovery phase of the re- 
sistance response, as suggested by Darrow 
(1934), but rather that it may be a partial 
determinant of skin-resistance level. 

It is not difficult to point out the limitation 
of the sweat-gland techniques used. Counts of 
active sweat glands provide no information 
on partially filled ducts: evaporative water- 
loss techniques record overall quantity of 
secreted moisture whether from sweat or 
epidermis. A total picture of all aspects of 
sweat-gland activity should ideally provide
data on a sequence of phases: secretory cell 
activity, sweat formation and composition, 
duct filling, visible sweat emergence, and re- 
absorption of sweat either at the base of the 
duct (Lloyd, 1959b) or by the horny layer
(Rothman, 1954). Until these aspects can 
be quantified and related to skin resistance 
measures, it is impossible to assess the limits 
of the sweat-gland contribution to skin re- 
sistance. 

The overall weight of evidence suggests at 
least a sizeable (and statistically significant) 
correlation between quite different methods 
of measuring sweat-gland activity and skin 
resistance. These findings show higher within- 
subject than between-subject correlations, and 
this suggests that the amount of activity of 
the sweat secretory cells which gives rise to 
measured electrical changes produces different 
amounts of sweat in different individuals. The 
variation in the “power” of different sub-
jects’ sweat glands may be greater than the
variation within individuals, which is about 
20 to 1 for man (Dole & Thaysen, 1953) and 
for the cat (Lloyd, 1959a). 


Sweat-Gland Activity and Skin Potential 


In this instance, the within- and between- 
subject data agree; neither Wilcott’s (1962) 
between-subject correlation of .29 for sweat- 
ing and SPL, nor Venables and Martin’s 
(1966) within-subject correlation (.04) is 
significant. These nonsignificant findings are 
in apparent contrast with the generally sig- 
nificant correlations reported (between-sub- 
jects) on skin-potential and skin-resistance 
levels. Grings (1953), for example, obtained 
an eta of .59, and Venables and Sayer (1963), 
in two separate studies, obtained correlations 
of .68 and .51 (using conductance measures). 
Venables and Martin (1966), however, ob- 
tained a nonsignificant correlation on within-
subject data (r = -.12 between conductance
and potential). 

The apparent contradictions in these find- 
ings may perhaps be resolved if we accept 
that the direct evidence on SPL and sweating 
suggests no relationship, a conclusion further 
supported by Venables and Martin’s (1966) 
finding of little effect of atropine on SPL. The 
indirect evidence from significant correlations 
between SPL and SRL may arise from a com- 
mon factor which is not sweat-gland activity 
but which might be internal electrolyte con- 
centration or nonsudorific epidermal charac-
teristics. If, for instance, the balance of body
electrolytes could produce a level of skin po- 
tential and tissue resistance which varied 
from person to person, a correlation between 
potential and resistance would be expected 
over a group of people but not within a single 
subject. 
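
The reasoning of this paragraph can be checked with a small simulation, offered only as a sketch under stated assumptions. The model is hypothetical and not drawn from any study cited here: each simulated person carries one stable factor (standing in for electrolyte balance) that shifts both potential and resistance, and the moment-to-moment fluctuations of the two measures are taken to be independent. All names, sample sizes, and the seed are arbitrary choices.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate(n_subjects=30, n_readings=200, seed=0):
    """Return (between-subject r, mean within-subject r) for the toy model."""
    rng = random.Random(seed)
    subject_means, within_rs = [], []
    for _ in range(n_subjects):
        # Assumed stable person-level factor shared by both measures.
        factor = rng.gauss(0.0, 1.0)
        # Assumed independent reading-to-reading fluctuations.
        potential = [factor + rng.gauss(0.0, 1.0) for _ in range(n_readings)]
        resistance = [factor + rng.gauss(0.0, 1.0) for _ in range(n_readings)]
        within_rs.append(pearson(potential, resistance))
        subject_means.append(
            (sum(potential) / n_readings, sum(resistance) / n_readings)
        )
    between_r = pearson(
        [p for p, _ in subject_means], [r for _, r in subject_means]
    )
    return between_r, sum(within_rs) / n_subjects


between_r, mean_within_r = simulate()
print(between_r, mean_within_r)
```

Under these assumptions the coefficient across subject means is high while the average within-subject coefficient stays near zero, which is the pattern the common-factor interpretation predicts. The sketch shows only that such a pattern is possible, not that electrolyte balance is in fact the factor involved.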

The mechanisms underlying the skin-poten- 
tial response have also been the subject of 
considerable controversy. Wilcott (1962) has 
reported extremely high correlations between 
skin sweating and SPRs, ranging from .62 to 
.80 (within subjects) for the negative wave
of the SPR, and from .81 to .95 for the posi- 
tive wave. So far as indirect evidence is con- 
cerned, Wilcott (1965) also reported high 
correlations between SRR and SPR ampli- 
tudes (ranging from .77 to .84, data ob- 
tained from cats). Similar correlations were 
earlier reported for humans (Wilcott, 1958a), 
but somewhat lower and more variable cor- 
relations have been found by Burstein, Fenz, 
Bergeron, and Epstein (1965). 

On the basis of this evidence, together with 
data showing that the latencies of the SPR 
and SRR are the same (Goadby & Goadby, 
1936; Wilcott, 1958a), and that both SPR 
and SRR are eliminated by atropine (Martin 
& Venables, 1964), it would seem reasonable 
to conclude that both negative and positive 
phases of the SPR are functions of sweat- 
gland activity. 

A number of hypotheses have been put for- 
ward concerning the more precise mechanism 
underlying the positive wave of the SPR. 
Lloyd (1961) termed the initial negative po- 
tential response the “action potential” of the 
sweat glands, and the later positive com- 
ponent the “secretory potential” which he 
suggested is due to duct filling. Further evi- 
dence for the latter viewpoint is provided by 
Wilcott (1962) who showed that the latency 
of the positive response of the SPR is similar 
to the latency of the overt sweating response. 
It is tempting to relate the positive phase to 
some aspect of duct filling (cf. Darrow, 1964), 
but other interpretations have been put for-
ward.

It has been argued by Trehub, Tucker, and
Cazavelan (1962) that while the negative
wave is an active process depending upon
neural innervation of sweat glands, the posi-
tive wave is a passive process resulting from
the breakdown of a polarized membrane ...
a surge of surface-negative ionic ... It
follows from this view that the probability of
occurrence of positive waves would increase
as the basal potential of the skin shifts in the
direction of increased surface negativity. Wil-
cott (1965) has produced some confirmatory
evidence that diphasic responses are more
likely to occur with a relatively high nega-
tive basal potential.

From an experimental point of view, it
would be useful if basal skin-potential level
could be manipulated to check on the pres-
ence or absence of diphasic responses. Un-
fortunately, those attempts which have been
made to shift SPL by artificial means (for
example, by the insertion of a voltage be-
tween the measuring electrodes; Wilcott,
1964) must be interpreted with reserve, since
the identification of the positive wave under
artificial conditions may be in doubt (...
willo, 1965).

Other views dispense altogether with sweat-
gland activity as the basis of the positive
SPR component. Edelberg (1963, 1964) ar-
gued that the positive SPR is a purely epi-
dermal response from the fact that the nega-
tive and positive components react differently,
for example, to temperature change and
hypoxia. If this is so, it must be cholinergi-
cally mediated since there seems to be little
doubt that iontophoresis of atropine or hyo-
scyamine completely abolishes both negative
and positive waves of the SPR (Martin &
Venables, 1964; Wilcott, 1964). It is difficult
to imagine what this mechanism might be, and
Edelberg offers no suggestions on this point.

In this connection, the data of Shaver,
Brusilow, and Cooke (1962) are often cited;
these authors observed a positive potential
change from the cat’s epithelium which was in-
dependent of sweat-gland activity. It should
be emphasized, however, that this epithelial
response is of very long latency and duration
(i.e., of the order of minutes) and probably
should not be compared with the human
positive wave which is much faster.

To summarize, it seems likely that the SRR
and the negative component of the SPR have
a similar physiological basis in sweat-gland
activity. Data on the positive component are
not conclusive. As they stand, they leave open
the possibility that sweat-gland activity is a
contributory factor, either actively as a con-
sequence of duct filling, or as a passive
equilibrating phenomenon. On the other hand,
Edelberg has put forward arguments to sup- 
port a non-sweat-gland, epidermal hypothesis. 
Our view is that some of the points raised by 
Edelberg (1964) do not really demolish the 
sweat-gland theory; for example, the fact 
that the positive and not the negative com- 
ponent is abolished by exsanguination could 
arise if the membranes responsible for nega- 
tive and positive waves had different thresh- 
olds to the resultant oxygen deprivation. A 
similar argument could be applied to the 
selective effects of temperature. Several lines 
of evidence make this a tenable hypothesis, 
and are discussed later. 


Limits of the Sweat-Gland Hypothesis 


While there is thus adequate evidence that 
sweat-gland activity contributes to SRR, SPR, 
and SRL, in the case of the latter, at least, 
other factors are demonstrably important. 
Even after atropinization, skin resistance is 
still measurable (Lader & Montagu, 1962), 
and Thomas and Korr (1957), on the basis of 
estimation from a regression line relating con- 
ductance and number of sweat glands, sug- 
gested values of from 3 to 10 megohms for the 
resistance of the nonsudorific element remain- 
ing when no sweat glands are active. The im- 
plication is that there are at least two factors 
which contribute to the electrical character- 
istics of the skin: the sweat glands and a 
nonsudorific element which Wilcott (1962) 
suggested is a function of the structural char- 
acteristics of the epidermis. It is likely that 
the mechanisms involved in both sudorific and 
nonsudorific cases are membranes having 
properties which are so complex that interpre- 
tation of their function has not yet been 
made clear.


Membrane Characteristics of the Skin 


Biological membranes may be included in 
a class of “ionic membranes” among whose 
characteristics listed by Teorell (1953) are 

resistance at a membrane are brought about

of the skin) are again a function of the char-
acteristics of the membrane and also of the
concentration of internal electrolytes in rela-
tion to the characteristics of the external
electrolyte. It can be shown that a super-
ficially located epidermal membrane generally
behaves as though it possesses a fixed nega-
tive charge: for instance, it is permeable

leave

excess of negatively charged
electrolyte external to the membrane,

an abraded palmar site, where ions can move
with equal velocity in both directions.

The most direct experimental evidence for
this view is from Rein (1924), who found
that electro-osmosis through an isolated piece
of epidermal tissue was of such a direction
that it behaved as though it had an external
negative charge. He showed that while basic
cationic dyes such as methylene blue will
penetrate some way into the epidermis, acid
(anionic) dyes which have a negative charge
will not. Additional findings of his indicated
that the surface-negative charge on the skin
shown by electro-osmosis through a portion
of isolated epidermal tissue was diminished
by the introduction of certain salts, and that
the effect was in the order of valency of
their cations, K+ < Na+ < Ca++ < Al+++. Alu-
minum ions were stated to be particularly
effective in reducing the negative charge on
the membrane.

Edelberg’s (1963) data on different electro- 
lytes are in accord with this evidence. He has 
shown an effect of type of cation on resting 
potential which he suggested is due to steric 
effects resulting from cation size. Potentials 
with the small cations Na+ and Li+ showed
little mean effect; large cations such as Ca++
and Al+++ bring about a marked diminution of
negative skin potential. These results provide 
additional support for the idea of accessibility 
of a surface-negative epidermal membrane. 
Further evidence comes from concentration 
effects. It has been shown that the concentra- 
tion of the external electrolyte used affects 
the apparent resistance of the skin, and simi- 
lar effects on skin potential have been shown 
by Edelberg (1963) and Venables and Sayer 
(1963). 

Concerning the more precise identification 
of the epidermal membrane responsible for 
the effects described above, Rothman (1954) 
reviewed the evidence and concluded that the 
effective boundary must lie between the 
cornified and noncornified layers of the epi- 
dermis, but he was not willing to limit the site 
specifically to the stratum lucidum as sug-
gested by Rein (1924). Suggestive support-
ing evidence for this membrane is, however,
provided by its topographical distribution. 
Montagna (1962) stated that the stratum 
lucidum is only found where the epidermis is 
very thick, while Rothman (1954) stated 
that it is confined almost exclusively to 
palmar and plantar surfaces. It is in these 
areas that skin-potential levels are highest, 
potential dropping markedly, for instance,
between the palm and forearm. Whatever 
structure is finally identified as being the 
relevant functioning membrane, the evidence 
for the existence of such a surface-negative 
membrane seems quite clear. 

It is customary in biological systems for a 
surface-negative membrane to show a posi- 
tive-going response on depolarization, and 
conversely for a surface-positive membrane 
to show a negative-going response. In the
present context, there are several findings 
consistent with this expectation. Shaver et al. 
(1962) obtained (a) a positive response from 
the epidermis which has an apparent, general, 
surface-negativity and (b) an initial negative 
SPR component from the secretory cell mem- 
branes of the sweat glands possessing a stand- 
ing surface-positive charge. (Other evidence 
indicates that positivity is a common char- 
acteristic of secretory membranes.) Yet, in 
spite of high negative potentials occurring at 
the surface of the skin, the initial component 
of the SPR is, as stated, characteristically 
negative. Edelberg (1963) suggested that 
measured SPL is in fact an algebraic sum of 
an outer surface-negative epidermal mem- 
brane and an inner surface-positive potential 
across the secretory membranes of the sweat 
glands, the net result under normal measur- 
ing conditions being negative. 
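
The two-membrane account just described may be stated compactly. The symbols are not in the original and are introduced here only as illustrative notation: V_e for the contribution of the outer, surface-negative epidermal membrane and V_s for the surface-positive potential across the secretory membranes of the sweat glands. The sketch assumes nothing beyond the simple algebraic summation named in the text.

```latex
\[
  V_{\mathrm{SPL}} = V_{e} + V_{s}, \qquad V_{e} < 0, \quad V_{s} > 0,
\]
\[
  |V_{e}| > V_{s} \;\Longrightarrow\; V_{\mathrm{SPL}} < 0 .
\]
```

On this reading the measured level is negative under normal conditions because the epidermal term is the larger in magnitude, while a response originating at the secretory membranes can still be negative-going.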

This model is in accord with a number of 
findings. It should be remembered that 
whereas an outer surface-negative epidermal 
membrane would be easily accessible to super-
ficially applied electrolytes (Rein, 1929), the
positive membrane at the base of the sweat 
glands would not be so easily reached since 
the ducts of the glands do not normally pro- 
vide a route for absorption of external elec- 
trolytes (Rothman, 1954). It is therefore 
understandable that varying the concentration 
of external electrolytes affects potential level 
(assuming this to be mediated by an outer 
accessible membrane), but effects on response 
measures (assuming these to be mediated by 
the deeper sweat-gland membranes) would
not be expected.

In contradiction to this expectation, data by 
Edelberg, Greiner, and Burch (1960) showed 
an effect from various electrolytes (K, Na, Li, 
etc.) on both resistance and potential re-
sponses. Our explanation of this discrepancy
is that if resistance and potential experiments 
were being carried out at the same time on 
the same skin sites, the passage of current 
required for the measurement of resistance 
provided the electrophoretic drive to intro- 
duce the external electrolyte down the sweat 
ducts and to the secretory membrane. In the
absence of this applied current, the secretory 
membrane would remain inaccessible. Some 
support for this view may be derived from a 
later experiment by Edelberg (1963), involv- 
ing only measurements of skin potential, 
which showed that neither the negative nor 
the positive component of SPR was affected 
by changes in concentration of NaCl, al-
though skin-potential level was affected.

To reiterate and slightly expand this view, 
if SRRs are mediated by the deeper sweat- 
gland membranes and not by an outer epi- 
dermal membrane, an effect from external 
electrolytes should be brought about only by 
the electrophoretic driving of cations to that 
membrane. The validity of the hypothesis of 
accessibility by electrophoretic driving by an 
imposed current is supported by the data of 
Edelberg et al. (1960). These are concerned 
with the effect of the polarity of the palmar 
electrode required to drive either anions or 
cations to the skin, and the size of cation or 
anion contained in the electrolyte. Small ca-
tions such as K+ and small anions such as Cl-
would not be expected to have much effect,
and this was shown by Edelberg et al. to be
the case. Larger anions such as SO4-- have a
slightly more marked effect, but the greatest
effect results from large cations such as Ca++
and Al+++, which is to be expected on the
basis of the effects on membrane function 
reported earlier. For skin-resistance responses, 
Edelberg et al. expressed the polarity changes 
as ratios; the ratio of SRR amplitude with 
anodal current to SRR with cathodal current 
is 1.04 for KCl, 1.58 for CaCl2, and 2.04 for
AlCl3.

These findings are in agreement with Edel- 
berg’s (1963) two-membrane hypothesis. By 
extension of the SRR data to SPRs, if the 
positive component of SPR were also medi-
ated by deep sweat-gland membranes, this


CENTRAL FACTORS


Experiments concerning the role of central 
mechanisms in skin-resistance and skin-po- 
tential measurements have been usually car- 
ried out with cats as subjects, with experi- 
mental lesions placed at various stages in 
the CNS. Fewer data are available on human 
subjects, and these will be considered first. 


Human Studies 


The problems arising from this sort of study 
are self-evident and were discussed by Sourek 
(1965). This author has carried out an ex- 
tensive series of studies on normal subjects 
and on patients with a wide range of CNS 
lesions, both traumatic and surgical. In all 
cases, the measure used was the SPR. The re- 
sults of lesions of peripheral nerves, sympa- 
thectomies of various types, cordotomy, etc., 
showed clearcut effects on the SPR (both neg- 
ative and positive phases). Thus the response 
disappears following severance of peripheral 
nerves, the lumbar or thoracic sympathicus, 
and the cervical and thoracic spinal cord. Re- 
sections of different parts of the cerebral 
hemispheres, however, produced no major 
changes in the SPR. As far as can be judged 
from the records presented by Sourek, nega- 
tive and positive waves of the SPR are simi- 
larly affected by the various lesions; that is 
to say, no differential effects are apparent. 

There are several difficulties in interpreting 
the SPR data following lesions since skin tem- 
perature is frequently altered and Sourek feels 
that changes in SPR and skin temperature are 
correlated. The present authors have also 
noted (Venables & Martin, 1966) marked 
differences in skin temperature following 
sympathectomy. In addition, patients whose 
sympathectomies were performed for the re- 
lief of Raynaud’s phenomenon often had 
marked atrophic changes of the skin. 


Animal Studies


It must be emphasized that the conditions 
under which experiments of this kind are exe- 
cuted are far from those of the normally func- 
tioning animal. The animal is frequently 
anesthetized and may have acute or chronic 
lesions of the central neuraxis. Stimulation is 
of a relatively artificial kind, for example, 
electrical stimulation of the lumbar sympa- 
thetic trunk, of various CNS regions, and 
cooling or injecting local anesthetics. 

Stimulation of an exposed lumbar sympa- 
thetic trunk obviously bypasses most of the 
functions of the CNS and is a highly artificial 
way of evoking a response from the sweat 
glands. So, too, is the stimulation of a bared 
cutaneous nerve which will also excite all the 
afferent fibres in the nerve trunk as well as 
those afferent pain fibers which provide the 
stimulus for the skin-potential response. 

Wang (1957, 1958) and others have con- 
sistently reported that the response recorded 
(i.e., potential response) is always mono- 
phasic, and they have considered the effects 
of their experimental manipulations in terms 
of whether the response is present or absent, 
and, if present, whether the amplitude of the 
response is changed. (Obviously the observed 
latency is largely a function of the site of 
application of the stimulus, i.e., whether to a 
peripheral nerve or to cortical regions.) The 
results of these studies have revealed a num- 
ber of excitatory and inhibitory influences on 
the skin-potential response arising from the 
central neuraxis, as well as intraspinally, and 
Wang has stressed that the response is the 
result of the algebraic summation of all these 
determining factors. 

There are several reservations which must 
be made concerning the applications of Wang’s 
data to interpretation of psychological experi- 
ments. In the first place, they deal entirely 
with the skin potential response. This means 
that the findings can, to a certain extent, be 
applied to skin resistance responses insofar as 
the two appear to be highly correlated in 
terms of frequency of occurrence and ampli- 
tude, the latter being the measure Wang has 
primarily been concerned with. The only rele-
vant data available on the correlation of am-
plitudes of SRRs and SPRs seem to be Wil-
cott’s (1965) correlations which range from
.77 to .84. These correlations are obviously
very high, but figures for human subjects are
not so consistent. Wilcott (1958) reported
high correlations, but the within-subject cor-
relations found by Burstein, Fenz, Bergeron,
and Epstein (1965) were substantially lower 
and more variable. 

Second, Wang’s data do not refer to 
standing levels of either potential or resist- 
ance, and his conclusion, for example, that 
the brain-stem reticular system plays a most 
important role in the GSR, applies strictly to 
the potential response only although future
data may extend this finding. 

A third and more puzzling factor is Wang’s
explicit statement that “a biphasic galvanic
skin reflex has never been encountered.” This
was slightly expanded in the statement of
Richter and Whelan (1943) that whereas
electric shocks applied to the cut surface of
the spinal cord and brain stem produced only
monophasic responses, biphasic or multiphasic
responses were produced on stimulation of
the intact cortex. These findings cannot be
wholly attributed to the use of ac amplifiers
with short time constants which might distort
the waveform.

Isamat’s (1961) records seem to provide 
evidence of biphasic responses following stimu- 
lation of the medial wall of the right cerebral 
hemisphere (limbic cortex), although the au- 
thor did not specifically discuss this. Shaver 
et al. (1962) also appear to have obtained 
a biphasic response to repetitive stimulation 
of the lumbar sympathetic trunk. Wilcott 
(1965) has explicitly stated that, in the wak-
ing cat, SPRs were always uniphasic nega-
tive waves, while in the anesthetized animal
and following repetitive nerve stimulation, di-
phasic waves were observed.

If, as was suggested earlier, the positive
phase of the SPR relates solely to duct filling,
it is difficult to see why it should be absent in
any of these cat preparations. Wilcott (1965)
suggested that the waveform may relate to skin
potential level such that diphasic SPRs occur
only with a relatively high, negative basal
potential. This view leads to two lines of
thought: one is that there may be species dif-
ferences in the levels of standing skin poten-
tials; the second, that standing level may
relate to a factor of behavioral arousal. Lied- 
erman and Shapiro (1964), for example, sug- 
gest that a high negative SPL is found in 
highly aroused subjects; and Yokota, Taka- 
hashi, Kondo, and Fujimori (1959) observed 
that the polyphasic waveform occurs more fre- 
quently in response to intense stimuli. 

Thus the positive response may represent a 
secondary phase of sweat-gland activity which 
may be determined in part by the nature of 
the stimulus. Wang and Brown (1957), in 
their discussion of terminal rebound of the 
GSR, pointed out that their stimulus pro- 
duced an inhibitory as well as an excitatory 
effect on the GSR. The response illustrated in 
their paper is not biphasic but a double, pre- 
sumably negative, wave. Terminal rebound 
was eliminated by manipulating the pa- 
rameters of the stimulus current so that slow 
Group C pain fibers were not excited. 

By analogy, different kinds of stimuli may 
excite fibers or processes which differ in trans- 
mission rate and which, in conjunction with 
the differential responsiveness of peripheral 
membranes, may affect the latency of positive 
as well as negative phases of the SPR. Evi- 
dence that stimuli have differential effects on 
these components was put forward long ago 
by Forbes (1936) and more recently by Edel- 
berg (1963) and Edelberg and Wright (1964). 

It seems likely that the final mediator of 
both components is the sweat glands, since 
iontophoresis of hyoscyamine abolishes both 
negative and positive phases. However, it has 
been noted that when the sweat glands were 
recovering from the effects of hyoscyamine, 
the negative response appeared earlier than 
the positive. This can possibly be interpreted 
in terms of a higher threshold to excitation of 
the membrane responsible for the positive re- 
sponse. This explanation may also account 
for Wilcott’s (1958b) finding that exsanguina- 
tion tends to abolish the positive wave and 
not the negative wave, on the assumption that 
the membrane responsible for the positive 
wave is more susceptible to hypoxia. Quite 
independent support for this view on differen- 
tial thresholds is provided by the studies of 
Yokota et al. (1959) concerning effects of


THE PSYCHE IN PSYCHOPHYSICS:


A SENSORY-DECISION THEORY ANALYSIS OF THE EFFECT OF 
INSTRUCTIONS ON FLICKER SENSITIVITY AND 
RESPONSE BIAS* 


W. CRAWFORD CLARK 
New York State Psychiatric Institute, and Department of Psychiatry, Columbia University 


Instruction-induced changes in flicker thresholds measured by traditional psy-
chophysical procedures may reflect changes in sensory sensitivity or in response
bias. In a group of 16 psychiatric patients, a facilitating set, in contrast to an
inhibiting set, increased the proportion of flicker responses to both a physically
intermittent light (“hits,” p < .01) and to a continuous light (“false affirma-
tives,” p < .01). Analysis of the data by the method of constant stimulus sug-
gested a change in the flicker threshold; however, analysis of the same data by
sensory (statistical) decision theory demonstrated that sensory sensitivity (d’)
was unchanged, and that only the Ss’ response bias or subjective criterion (x_c)
was altered. The results suggest that differences in sensory thresholds, which are
often reported between control and experimental groups, are likely to reflect a
difference in attitude towards the subjective costs and values of the various
decision outcomes.


Sensory thresholds obtained by the tradi- 
tional psychophysical techniques of limits and 
constant stimulus can be altered by nonsen- 
sory variables such as mental sets induced by 
the experimenter’s (E’s) instructions. For ex-
ample, facilitating instructions which induced
the subject (S) to respond “flicker” to the 
first momentary unsteadiness of any portion 
of a test patch lowered the threshold to flicker; 
while inhibiting instructions which induced S 
to respond “flicker” only when the entire test 
patch remained unsteady for a considerable 
period of time raised the threshold to flicker 
in normal observers (Holland, 1961; Knox, 
1945; Landis & Hamwi, 1954). When reas- 
suring (facilitating) instructions were given, 
threshold differences between brain damaged 
patients and controls on flicker fusion, spiral 
aftereffect, and reversible figure tasks disap- 
peared, suggesting that an initial attitudinal 
difference had been removed by the instruc- 
tions (Lodge, 1961; Mayer & Coons, 1960). 
The question arises, did the instructions af-
fect sensory sensitivity as the change in
threshold implies, or did they influence re-
sponse bias? Or, perhaps, both? The classical
psychophysical procedures fail to answer this
question because, as Goldiamond (1958) has
pointed out, the thresholds obtained by these
methods confound sensory and response proc-
esses.


1 The author extends sincere thanks to Mrs. Jane
Courten Brown for collecting the data. This research
was supported in part by Grant G 9594 from the
National Science Foundation and Grants MH 03616
and MH 07279 from the National Institutes of
Health, United States Public Health Service.

The problem posed by the effect of instruc-
tions on threshold is of general importance
since other nonsensory variables, such as S’s
personality or his attitude towards the social
demand characteristics of the test situation,
may influence thresholds determined by the
method of limits and constant stimulus. Such
“psychological contamination” of what is usu-
ally thought to be a purely sensory threshold
may be responsible for a number of contro-
versies. For example, many researchers have
interpreted a raised sensory threshold in psy-
chiatric patients as an indication of neuro-
sensory dysfunction, while others hold that
such differences reflect personality variables
(Granger, 1953). A second widely held view
is that raised thresholds reflect the influ-
ence of unconscious processes in perception
(Brown, 1961). However, Goldiamond (1958)
offers cogent arguments to the contrary and
maintains that both perceptual defense and
perceptual vigilance are caused by response
biases. Finally, the concept that behavior can
be governed by cues which are below aware-
ness depends upon the existence of a valid
threshold measure of “subliminal” stimuli; 
such thresholds are seldom obtained, and al- 
ternative interpretations of the 

are equally reasonable (Eriksen, 1960). These 
controversies suggest that traditional psycho- 
physical procedures are invalid indicators of 
perception, particularly when a small number 
of observations are taken on untrained, atti- 
tudinally biased Ss. 

Three recently developed techniques do suc- 
ceed in separating the effects of sensory sensi- 
tivity and response bias on the threshold. 
Goldiamond and Hawkins (1958) secured a 
pure measure of response bias (but not of 
sensory sensitivity) by obtaining verbal “rec- 
ognition” responses to blank stimuli. Clark 
and Brown (1963) obtained a pure measure 
of sensory sensitivity (but not of response 
bias) in attitudinally biased Ss by using the 
forced-choice technique (Blackwell, 1953). 
The disadvantage of these procedures is that 
they fail to allow a simultaneous measure- 
ment of both sensory sensitivity and attitudi- 
nal bias. This difficulty is solved by the sta- 
tistical-decision theory of signal detection 
(Swets, Tanner, & Birdsall, 1961), a proce- 
dure which provides two numerical indices of 
observer performance: a relatively pure meas- 
ure of sensory sensitivity (d’), which remains 
unaltered in the face of manipulation of non- 
sensory variables such as payoffs, and a meas- 
ure of the position of the subjective criterion 
(x_c), which indexes S’s attitudinal bias inde-
pendently of his sensory sensitivity. This pro-
cedure, described in greater detail below, thus
offers a direct means of testing the effect of
instructions on sensory sensitivity and re-
sponse bias.


HYPOTHESES 


1. The proportion of “seen” to “not seen”
responses to threshold stimuli is altered by
instructions to hospitalized psychiatric pa-
tients. Instructions implying that normal peo-
ple see a relatively large number of stimuli
(facilitating set) increase the number of
“seen” responses, while instructions implying
that normals do not report stimuli unless they
are certain of their presence (inhibiting set)
decrease the number of “seen” responses.


2. Analysis of these data by the traditional 
psychophysical procedure of constant stimu- 
lus yields a lower sensory threshold under the 
facilitating set than under the inhibiting set. 

3. Analysis of the same data by the sen-
sory-decision technique demonstrates that sen-
sory sensitivity (d’) is not altered by instruc-
tions, but that S’s subjective criterion (x_c)
shifts as his attitudinal bias towards the vari- 
ous decision outcomes is altered. 

SENSORY-DECISION THEORY

Sensory-decision theory has been rigorously 
described (Green, 1960; Luce, 1963; Swets, 
1964; Swets et al., 1961), and only a simpli- 
fied outline will be given here. The theory as- 
sumes that background interference, termed 
noise, is always present in amounts that 
vary randomly over time (the left-hand dis-
tribution in Figure 1). Thus, when E presents
a blank, noise alone is present; and when E
presents a stimulus, signal plus noise is pres-
ent (the right-hand distribution in Figure 1).
Since the magnitude of the noise varies ran-
domly over time, S’s problem in a psycho-
physical experiment is to decide whether a
sensory experience was more likely to have
been caused by signal plus noise, or was more
likely to have arisen from noise alone. In or-
der to make his decisions in a consistent man-
ner, S chooses a cut-off point, a criterion mag-
nitude of sensory experience.

The sensory-decision theory assumes that
each sensory experience may be represented
by a number or a set of numbers which, de-
pending upon how abstract one chooses to be,
may be thought of as (a) descriptors of quali-
tative attributes of the sensory experience
(e.g., hue, saturation, and brightness), (b) fre-
quency of neural impulses, or (c) a value of
the likelihood ratio (described below). How-
ever represented, this abstraction is termed
an observation. This observation variable, x,
varies continuously along a single dimension,
the observation or decision axis. The decision
axis, or any monotonic transformation of it,
may be treated as a likelihood-ratio axis, which
is independent of any particular sensory char-
acteristic of signal and noise. The likelihood
ratio is a number obtained by dividing the
probability that a particular observation
value (x) arose from the signal-plus-noise
distribution by the probability that the same
observation resulted from the noise distribu-
tion. In Figure 1, for example, the likelihood
ratio equals .43 at Point A, and 1.0 at B.

FIG. 1. The probability density functions of noise and
signal plus noise. (Abscissa: observation, x.)
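
The likelihood-ratio computation just described can be illustrated with a short sketch. It assumes, as the theory does later in the paper, that the noise and signal-plus-noise distributions are normal with equal variance, and it further assumes that Figure 1 is drawn for a separation of 1.2 standard deviations (the d’ reported in the Results); the function names are illustrative only and do not come from the original.

```python
from math import log
from statistics import NormalDist

# Assumption: Figure 1 is drawn for d' = 1.2, in noise standard deviation units.
D_PRIME = 1.2

noise = NormalDist(mu=0.0, sigma=1.0)
signal_plus_noise = NormalDist(mu=D_PRIME, sigma=1.0)


def likelihood_ratio(x: float) -> float:
    """Density of x under signal plus noise divided by its density under noise."""
    return signal_plus_noise.pdf(x) / noise.pdf(x)


def observation_for_ratio(ratio: float, d_prime: float = D_PRIME) -> float:
    """Observation value at which the likelihood ratio equals `ratio`.

    For equal-variance unit normals, ln LR(x) = d' * x - d'**2 / 2,
    so the ratio can be inverted in closed form.
    """
    return (log(ratio) + d_prime ** 2 / 2) / d_prime


point_b = observation_for_ratio(1.0)   # ratio 1.0: midway between the two means
point_a = observation_for_ratio(0.43)  # ratio .43: slightly below the noise mean

print(f"B: x = {point_b:.2f}, LR = {likelihood_ratio(point_b):.2f}")
print(f"A: x = {point_a:.2f}, LR = {likelihood_ratio(point_a):.2f}")
```

Under these assumptions a ratio of 1.0 falls where the two curves cross, and a ratio of .43 falls to the left of the noise mean, which agrees with the relative positions of Points B and A described in the text.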

S is treated as a statistical decision maker
by the theory. He must assign the observa-
tion, x, to either the noise or the signal-plus-
noise distribution. To do this, he is assumed
to establish a likelihood-ratio criterion on the
basis of some objective or goal. (These de-
cision rules are discussed later.) If the ob-
servation is greater than the criterion, he says
“yes.” If a stimulus was present, he scores a
hit; and if a blank was present, he obtains a
false affirmative. In Figure 1, the conditional
probability of a hit appears as the area under
the signal-plus-noise distribution to the right of
the subjective criterion (x_c), and the condi-
tional probability of a false affirmative appears
as the area under the noise distribution to the
right of the subjective criterion. These two
values are sufficient to determine d’, where d’
is defined as the difference between the means
of the two distributions expressed in terms of
their standard deviation. The measure of sig-
nal detectability (d’) is a function of both the
stimulus strength and the observer’s sensory
sensitivity. The following points should be
clear from Figure 1: (a) False affirmatives
are as important as hits in determining sen-
sory sensitivity. (Thus d’ is not equivalent to,
although it resembles, the just-noticeable dif-
ference of classical psychophysics which is
typically based solely on the hit rate.) (b)
The likelihood-ratio criterion and d’ are in-
dependent of each other. (c) For a given d’,
there is a narrowly limited set of conditional
probability values which the hit and false
affirmative pairs can assume as the criterion
shifts. In Figure 1, for example, only those
pairs of possible values (represented by areas
under the two curves to the right of the cri-
terion), which are generated by moving the
criterion through all values of x, are admis-
sible.
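
Points (a) through (c) can be checked numerically with a minimal sketch, again assuming equal-variance normal distributions with the noise mean at zero. The helper names are hypothetical. The sketch moves the criterion along the decision axis, computes the two areas to its right, and then recovers d’ from each resulting pair.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # unit normal; the noise distribution is centered at 0


def rates_from_criterion(x_c: float, d_prime: float) -> tuple[float, float]:
    """Return (hit rate, false-affirmative rate) as areas to the right of x_c."""
    false_affirmative = 1.0 - STD_NORMAL.cdf(x_c)   # area under the noise curve
    hit = 1.0 - STD_NORMAL.cdf(x_c - d_prime)       # area under signal plus noise
    return hit, false_affirmative


def d_prime_from_rates(hit: float, false_affirmative: float) -> float:
    """Recover d' as the difference between the two normal deviates."""
    return STD_NORMAL.inv_cdf(hit) - STD_NORMAL.inv_cdf(false_affirmative)


# Moving the criterion changes both rates together, but not d'.
for x_c in (-0.5, 0.0, 0.5, 1.0, 1.5):
    hit, fa = rates_from_criterion(x_c, d_prime=1.2)
    recovered = d_prime_from_rates(hit, fa)
    print(f"x_c = {x_c:+.1f}  hit = {hit:.2f}  fa = {fa:.2f}  d' = {recovered:.2f}")
```

Every criterion value in the loop yields a different pair of rates and the same recovered d’, which is the sense in which the admissible pairs for a given d’ form a single family.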

In addition to the two empirical measures 
of observer performance just considered, sen- 
sory-decision theory specifies the optimal like- 
lihood-ratio criterion for an idealized observer. 
This critical value of the likelihood ratio can 
be specified under a wide variety of objectives 
or decision goals and for many of the param- 
eters describing a detection situation. Two 
possible decision goals will be considered: (a) 
S maximizes expected value and (b) S maxi- 
mizes the number of correct decisions. If the 
observer chooses a decision rule which maxi- 
mizes expected value, then the following for- 
mula is appropriate: 


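\[
\text{Optimal likelihood-ratio criterion} =
\frac{\text{A priori probability of a blank}}{\text{A priori probability of a stimulus}}
\times
\frac{\text{Value of a correct rejection} + \text{Cost of a false affirmative}}{\text{Value of a hit} + \text{Cost of a miss}}
\tag{1}
\]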


Money is usually used to manipulate the pay- 
off matrix. However, analysis of human behav- 
ior often suggests that subjective probability 
and subjective utility affect S’s decision. This 
may alter the particular likelihood-ratio cri- 
terion value, but a likelihood-ratio criterion is 
still the optimum decision rule. A second pos- 
sibility is that S disregards utility and chooses 
the decision goal of maximizing the percent-
age of correct decisions. In this case, hits and
correct rejections are equal in value while
errors are equally abhorred, and, as should be 
clear from Equation 1, the a priori probabili- 
ties alone determine the optimal value of the 
likelihood-ratio criterion. 
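
As a worked illustration of Equation 1, the sketch below evaluates the optimal criterion for the a priori stimulus probability of .7 used in this experiment. The function and parameter names are illustrative, and the unit payoffs are an assumption standing in for the case in which values and costs are equal.

```python
def optimal_criterion(
    p_stimulus: float,
    value_hit: float = 1.0,
    cost_miss: float = 1.0,
    value_correct_rejection: float = 1.0,
    cost_false_affirmative: float = 1.0,
) -> float:
    """Optimal likelihood-ratio criterion under the expected-value rule (Equation 1)."""
    p_blank = 1.0 - p_stimulus
    prior_odds = p_blank / p_stimulus
    payoff_ratio = (value_correct_rejection + cost_false_affirmative) / (
        value_hit + cost_miss
    )
    return prior_odds * payoff_ratio


# Equal values and costs: the a priori probabilities alone fix the criterion.
equal_payoffs = optimal_criterion(p_stimulus=0.7)

# Hypothetical observer for whom a false affirmative is three times as costly.
cautious = optimal_criterion(p_stimulus=0.7, cost_false_affirmative=3.0)

print(f"equal payoffs: {equal_payoffs:.2f}")
print(f"cautious observer: {cautious:.2f}")
```

With equal payoffs the criterion reduces to .3/.7, or about .43; raising the subjective cost of a false affirmative raises the criterion, that is, makes it more conservative.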


PROCEDURE 


The visual display used in an earlier experi-
ment (Clark, Rutschmann, Link, & P
1963) was slightly modified to provide a
single, centrally located stimulus, a neon bulb
mounted behind opal glass with a white card-
board surround. All other conditions, in-
cluding duration of the light (1.7 sec.) and
luminance (4.2 mL.), remained unchanged. E
presented either a physically intermittent light 
(stimulus) or a physically continuous light 
(blank). The task of S was to decide after 
each presentation whether the trial or ob- 
servation interval had contained an intermit- 
tent or a continuous light. Since the con- 
tinuous and intermittent pulses were matched 
for Talbot luminance, identification of the 
intermittent stimulus required the detection 
of at least one dark interval in the pulse 
train. In sensory-decision terminology, the 
intermittent light represents signal plus 
noise while the steady light is noise alone. 
Each S’s approximate threshold was deter- 
mined at the beginning of the session by the 
discontinuous method of limits, using five 
ascending and five descending trials. This 
individually tailored threshold frequency was 
kept constant over the two instructional sets. 
In each session, 70 intermittent stimuli and 
30 steady blanks were presented in random 
order. Ss were informed of the a priori prob- 
ability of stimulus occurrence (.7), but not 
whether their responses were right or wrong. 

Ss were 16 male patients at the New York 
State Psychiatric Institute, mostly diagnosed 
as schizophrenic, with an average age of 22 
years. Psychiatric patients rather than normal 
Ss were used because an earlier study (Clark 
& Brown, 1963) had suggested that patients 
were more vulnerable than normals to the 
social demand characteristics of the experi-
mental situation.

The instructions were as follows: “After I
say, ‘Ready,’ either a steady or a flickering 
light will appear. Depending upon how it 
looks, answer, ‘Flicker,’ or, ‘Steady.’ Even 
if you are not sure, you must guess. Seventy 
per cent of the time the light will be really 
[physically] intermittent, but at such a rate 
that you will not always see it flicker. 
Whether the light is really steady or flicker- 
ing on each trial is determined entirely by 
chance.” The remaining instructions intro- 
duced one of the two sets. Facilitating in- 
structions: “Most people have very good 
vision. They see the lights as flickering most 
of the time. Please do as well as possible.” 
Inhibiting instructions: “Most people are 
careful about what they see. They do not 
report the light as flickering unless they are 
fairly certain. Please do as well as possible.” 

Each S received one of the instructional 
sets during the first session, and the other 
set during the second session 1 day later. 
Ss and conditions were counterbalanced for 
sequence effects. 


RESULTS 


The results averaged over Ss, expressed
both as number of flicker and steady re-
sponses and as conditional probabilities, ap-
pear in Table 1. The total number of af-
firmative (flicker) responses decreased from
821 under the facilitating set to 606 under
the inhibiting set, this change being reflected
in both the number of hits (“flicker” to an
intermittent light) and in the number of false
affirmatives (“flicker” to a continuous light).
The change in the number of hits and the
change in the number of false affirmatives
produced in each S by instructional sets
were tested for significance by the Wilcoxon
matched-pairs signed-ranks test (two-tailed).
The number of hits under the facilitating
set was significantly higher (p < .01) than
the number of hits under the inhibiting set,
and the number of false affirmatives under
the facilitating set was significantly higher
(p < .01) than under the inhibiting set. No
order effects were apparent in the data.


TABLE 1
DECISION OUTCOMES UNDER TWO INSTRUCTIONAL SETS

                              Responses under the facilitating set        Responses under the inhibiting set
                              “Flicker”             “Steady”              “Flicker”             “Steady”
Stimulus: Intermittent pulse  Hits: 718 (.64)       Misses: 402 (.36)     Hits: 554 (.49)       Misses: 566 (.51)
Blank: Continuous pulse       False affirmatives:   Correct rejections:   False affirmatives:   Correct rejections:
                              103 (.21)             377 (.79)             52 (.11)              428 (.89)
Total responses               821                   779                   606                   994

Note. Conditional probabilities appear in parentheses.

The conditional probabilities, that is, the 
probability of a flicker response given that 
the intermittent light had been presented 
(hit), or that the continuous light had been 
presented (false affirmative), appear in Table 
1. These conditional probabilities have been 
used to interpret the data according to (a) 
the constant-stimulus method, and (b) the 
sensory-decision procedure. 

Method of constant stimulus. Psychophysi- 
cal ogives were obtained by plotting the con- 
ditional probability of a flicker response (the 
hits and false affirmatives from Table 1) 
against stimulus frequency, the continuous 
stimulus being treated as equivalent to an 
intermittent stimulus of a high frequency (42.0 
cps). The location of this equivalent frequency 
along the abscissa and hence the slope of the 
psychometric functions are based on com- 
plete psychophysical ogives determined by five 
stimulus frequencies and a blank on similar 
Ss under identical stimulus conditions (Clark 
et al., 1963). The stimulus threshold, defined 
as the stimulus frequency which yields a 50% 
probability of successful detection of flicker 
without correction for chance success, was 
estimated to be lower (36.0 cps) under the 
facilitating set than under the inhibiting set 
(33.3 cps). 

It might be argued that a correction for 
chance success would remove the influence of 
instructional set on the constant-stimulus 
threshold, since it would compensate for the 
change in the S’s tendency to make a “yes” 
response to stimuli which were truly below 
threshold. The correction for chance success 
is effected by the following equation (Swets 
et al., 1961): 


\[
\text{Corrected per cent success} =
\frac{\text{Hits} - \text{False affirmatives}}{1 - \text{False affirmatives}}
\tag{2}
\]

Correction for chance success or “guessing” de-
creases, but fails to remove, the difference in
threshold between the facilitating set (33.6 
cps) and the inhibiting set (32.5 cps). More- 
over, this correction can be shown to be in- 
appropriate, since the formula requires the 
false affirmative rate to be independent of the 
hit rate. In fact, the contrary is true. Hit and 
false affirmative rates have been demon- 
strated to be positively correlated (Swets et 
al., 1961), a finding which is reflected in the 
results of the present experiment (Table 1). 
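
Equation 2 can be applied directly to the conditional probabilities of Table 1, as in the following sketch (the function name is illustrative). It gives the corrected proportion of successes at the single test frequency used here, not the interpolated thresholds in cps, which depend on the psychometric functions of the earlier study.

```python
def corrected_success(hit: float, false_affirmative: float) -> float:
    """Correction for chance success (Equation 2)."""
    return (hit - false_affirmative) / (1.0 - false_affirmative)


# Conditional probabilities taken from Table 1.
facilitating = corrected_success(hit=0.64, false_affirmative=0.21)
inhibiting = corrected_success(hit=0.49, false_affirmative=0.11)

print(f"facilitating set: {facilitating:.3f}")
print(f"inhibiting set:   {inhibiting:.3f}")
```

The corrected proportions remain unequal under the two sets, which is in line with the statement that the correction reduces but does not remove the effect of instructions.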

Sensory-decision procedure. Treatment of
the conditional probabilities of a hit and of a
false affirmative (Table 1) according to the
sensory-decision model revealed the follow-
ing: Under the facilitating set, the measure
of sensory sensitivity (d’) was 1.2, and the
measure of attitudinal bias (the likelihood-
ratio criterion) was 1.3. These values place
the subjective criterion (x_c) .9 standard devi-
ation to the right of the mean of the noise
distribution (indicated at C in Figure 1).
Under the inhibiting set, the measure of
sensory sensitivity (which is free to take
any value) remained unchanged (d’ = 1.2);
but the likelihood-ratio criterion was 2.1,
placing the subjective criterion 1.3 standard
deviations to the right of the mean of the
noise distribution (Point D). This difference
of .4 standard deviation between subjective
criterion locations is statistically significant
since, with d’ remaining constant, only the
other index of observer behavior, x_c, could
reflect the significant differences between hit
rates and false alarm rates which were ob-
tained (Table 1) under the two instructional
sets.
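
The two indices can be recomputed from the rounded conditional probabilities of Table 1 under the equal-variance normal model. The sketch below is one conventional way of doing so and is not necessarily the author's own computational procedure, which is not described; the function name is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def sdt_indices(hit: float, false_affirmative: float) -> tuple[float, float, float]:
    """Return (d', criterion location x_c, likelihood-ratio criterion).

    Assumes equal-variance normal distributions, with x_c measured in
    standard deviations from the mean of the noise distribution.
    """
    z_hit = STD_NORMAL.inv_cdf(hit)
    z_fa = STD_NORMAL.inv_cdf(false_affirmative)
    d_prime = z_hit - z_fa
    x_c = -z_fa
    # Ratio of the two density heights at the criterion.
    ratio = STD_NORMAL.pdf(z_hit) / STD_NORMAL.pdf(z_fa)
    return d_prime, x_c, ratio


d_fac, x_fac, ratio_fac = sdt_indices(hit=0.64, false_affirmative=0.21)
d_inh, x_inh, ratio_inh = sdt_indices(hit=0.49, false_affirmative=0.11)

print(f"facilitating: d' = {d_fac:.2f}, x_c = {x_fac:.2f}, ratio = {ratio_fac:.2f}")
print(f"inhibiting:   d' = {d_inh:.2f}, x_c = {x_inh:.2f}, ratio = {ratio_inh:.2f}")
```

Computed this way, d’ rounds to 1.2 under both sets and the likelihood-ratio criteria come out close to 1.3 and 2.1. The criterion locations need not match the reported .9 and 1.3 exactly, since they depend on rounding and on how the original values were obtained, but the inhibiting criterion lies about .4 standard deviation to the right of the facilitating one.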

The empirically determined likelihood-ratio
criteria may be compared with that of an
ideal observer. If the decision goal is to
maximize the percentage of correct decisions
(and ignore payoffs), or, if the decision goal
is to maximize subjective utility and the
values and costs of correct rejections and
false affirmatives are subjectively equal to
those of hits and misses, the optimal value of
the likelihood ratio is .43. If the ideal ob-
server had a d’ equal to 1.2, then the subjec-
tive criterion (x_c), located at A in Figure 1,
would be associated with a hit rate of .90,
and a false positive rate of .52.

The data (from Table 1) are plotted in the
form of a receiver-operating-characteristic
curve in Figure 2. Since this curve is gener-
ated by moving the likelihood-ratio criterion
along the decision axis (portrayed graphically
in Figure 1) for a single value of d’, it is
also known as an isosensitivity curve. The
conditional probability of a hit is plotted
along the ordinate, and the conditional prob-
ability of a false affirmative along the
abscissa. The conditional probabilities (from
Table 1) obtained under the facilitating set
yield the point (.21, .64) while those obtained
under the inhibiting set yield the point (.11,
.49). Both points fall on the same receiver-
operating-characteristic curve, d’ = 1.2, dem-
onstrating that instructions were without
effect on sensory sensitivity.

The mean likelihood-ratio criterion used by 
the Ss is the slope of the tangent to the 
receiver-operating-characteristic curve, the 
slope being 1.3 under the facilitating set and 
2.1 under the inhibiting set. The location of 
the optimal likelihood-ratio criterion is also 
portrayed in Figure 2. If the idealized ob- 
server chooses the decision goal of maximizing 
the number of correct answers, then the 
optimal value of the likelihood-ratio criterion 
is .43, and the ideal point on the receiver- 
operating-characteristic curve is (.52, .90) 
when d' = 1.2 (A in Figure 2). 
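
The correspondence between a likelihood-ratio criterion and a point on the isosensitivity curve can be sketched as follows, assuming equal-variance normal distributions; the function name is illustrative. Each criterion value is converted to a location on the decision axis, and the two areas to its right give the coordinates of the point.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()


def roc_point(ratio: float, d_prime: float) -> tuple[float, float]:
    """Return (false-affirmative rate, hit rate) for a likelihood-ratio criterion.

    Assumes unit-variance normals with the noise mean at 0, for which
    ln(ratio) = d' * x_c - d'**2 / 2.
    """
    x_c = (log(ratio) + d_prime ** 2 / 2) / d_prime
    false_affirmative = 1.0 - STD_NORMAL.cdf(x_c)
    hit = 1.0 - STD_NORMAL.cdf(x_c - d_prime)
    return false_affirmative, hit


ideal = roc_point(ratio=0.43, d_prime=1.2)         # Point A in Figure 2
facilitating = roc_point(ratio=1.3, d_prime=1.2)
inhibiting = roc_point(ratio=2.1, d_prime=1.2)

for label, (fa, hit) in (
    ("ideal", ideal),
    ("facilitating", facilitating),
    ("inhibiting", inhibiting),
):
    print(f"{label:>12}: ({fa:.2f}, {hit:.2f})")
```

The points obtained in this way lie close to the ideal point (.52, .90) and to the observed points (.21, .64) and (.11, .49); small discrepancies in the second decimal place are to be expected, because the criteria and d’ entered here are themselves rounded values.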

Swets et al. (1961) pointed out that when 
the receiver-operating-characteristic curve is 
plotted on double-probability paper (not 
shown), it appears as a straight line with a 
slope 1.0, if the noise and signal-plus-noise 
distributions are normal and of equal vari-
ance. Plotted in this manner, the observed
data points yielded a slope of 1.03, lending 
empirical support to the assumption of equal 
variance made earlier. 


DISCUSSION AND CONCLUSIONS 


Analysis of the results by means of sensory-
decision theory demonstrated that instruc-
tional sets affect the location of S’s criterion,
and hence his response bias, but not his
sensory sensitivity. Facilitating instructions
decreased the optimal value of the likelihood-
ratio criterion by increasing the subjective
value of a hit and cost of a miss, relative
to the subjective value of a correct rejection
and cost of a false affirmative (Equation 1).
An increase in the proportion of “yes” re-
sponses was associated with this relatively
liberal criterion. Inhibiting instructions had
the opposite effect, and increased the optimal
value of the likelihood-ratio criterion. A de-
crease in the proportion of “yes” responses
was associated with this relatively conserva-
tive criterion. In spite of the fact that instruc-
tions altered the location of the subjective
criterion, instructions were without effect on
the measure of sensory sensitivity (d’), for the
observed hit and false affirmative points fell
on the same isosensitivity curve under both
instructional sets.

FIG. 2. Isosensitivity or receiver-operating-charac-
teristic curve for d’ = 1.2. (A represents the theo-
retical location of an ideal observer according to the
expected-value model.)

On the other hand, analysis of the same 
data by the method of constant stimulus, 
both with and without correction for chance 
success, demonstrated that facilitating in- 
structions decreased the threshold to flicker, 
while inhibiting instructions increased the 
threshold. Since the threshold is generally 
regarded to be the reciprocal of sensory sensi- 
tivity, one might conclude that the instruc- 
tions had actually influenced S’s sensitivity. 
However, this seems rather unlikely, since 
the same Ss as well as the same stimulus fre- 
quencies were used under both instructional 
sets. It is more likely that the change in
flicker threshold reflected the effect of in- 
structions on the attitudinal or response bias 
of S. It may be concluded that even with
correction for chance success, the constant-
stimulus threshold is not a valid indicator of
sensory sensitivity because it is influenced by 
nonsensory variables. The same conclusion 
applies to other psychophysical procedures, 
such as limits and adjustment, in which 
the subjective criterion of S varies in an 
undeterminable manner. 

The relation between the location of the 
empirically determined subjective criteria and 
the criterion of the ideal observer reveals 
that instructions are unable to overcome 
certain basic attitudes of S towards the test 
situation. According to the expected-value 
model (Equation 1), if an attitudinally un- 
biased S receives neutral instructions, the 
optimal likelihood-ratio criterion is .43. If the 
instructions are inhibitory, this value in- 
creases; and if the instructions are facilitory, 
the likelihood ratio decreases to a value below 
.43. The results of the present experiment
demonstrate that the likelihood-ratio cri- 
terion under the facilitating set is above, not 
below, .43, suggesting that Ss have a gener- 
ally cautious bias which even facilitating in- 
structions fail to overcome. The location of 
the subjective criterion suggests that Ss find 
the cost of a false affirmative plus the value 
of a correct rejection to be greater than the 
value of a hit plus the cost of a miss. Such 
a high or conservative criterion means that 
Ss chronically fail to give sufficient weighting 
to the value of a “yes” response. This excess 
caution has also been reported in normal Ss, 
and seems to reflect a culturally determined 
attitudinal bias (Swets et al., 1961). This 
seems reasonable, for to admit that one can- 
not perceive something which actually is pres- 
ent (a miss) is an honest admission of a 
sensory incapacity, while to claim to see 
something which actually is not there (a false 
affirmative) is to be caught in a lie. Thus Ss 
in general, and perhaps patients in particular, 
reserve affirmative responses for sensory ex- 
periences which are relatively clear, that is, 
well above the noise level. For example, Clark 
and Brown (1963) found that although psy- 
chiatric patients did not differ from normals 
with respect to sensory sensitivity (deter- 
mined by the forced-choice procedure), the 
patients had a much higher method-of-limits 
threshold because they set their subjective
criterion at a lower stimulus frequency. In
other words, patients, even though they had 
actually received neutral instructions, be- 
haved as though they had received inhibiting 
instructions. 

A number of studies in the literature sug- 
gest how personality and situational variables 
can influence S’s attitude to produce either 
a liberal or a conservative criterion setting, 
even though the instructions are neutral.
Edwards (1961) pointed out that instructions 
in psychological experiments are ambiguous, 
since they fail to specify the exchange rate 
between correct answers and errors, making a 
single optimal strategy impossible and per- 
mitting a variety of psychological variables 
to influence S’s decisions in a number of 
undetermined ways. Personality variables can 
produce either a liberal or a conservative 
criterion. Some Ss are chronically motivated 
to take a chance in risky decision-making 
situations (Atkinson, 1957). Even with neu- 
tral instructions, such Ss could be expected 
to set a liberal criterion, that is, to choose 
a small likelihood-ratio criterion. Other Ss are 
intrinsically unwilling to make a decision on 
the basis of inadequate information (Binder, 
1958). They can be expected to set a con- 
servative subjective criterion, that is, to 
choose a large likelihood-ratio criterion. In 
addition, the personality of S will interact 
with the personality of E (Masling, 1960),
and with the perceived social demand charac-
teristics of the experimental situation (Orne,
1962; Rosenthal, 1963). Thus, there are a
large number of complexly interacting psy- 
chological and social variables which can 
influence S’s attitude, and hence the location 
of his subjective criterion in either a liberal 
or a conservative direction. 

Since attitudinal bias can influence sensory 
thresholds determined by traditional psycho- 
physical procedures, it is clear that care must 
be taken in interpreting the results of experi-
ments in which the thresholds of pathological
and normal Ss are compared. For example, Ss
who were anxious (Goldstone, 1955; Jones,
1958), aged (summarized by Landis &
Hamwi, 1956), lower in intelligence (Colgan,
1954), or poor perseverators (Biesheuvel,
1938) have been found to have a higher
threshold to visual flicker than controls. Psy-
chiatric patients, especially those diagnosed
as schizophrenic, also appear to have a higher 
threshold to flicker (Saucer & Sweetbaum, 
1958) and to threshold intensity (Granger, 
1957). Since it has been demonstrated that 
visual sensitivity is affected by physiological 
variables such as anoxia and drugs (Landis, 
1954), these differences in visual thresholds 
between pathological and control groups are 
often ascribed to a variety of neurosensory 
origins, such as physiological imbalance, a 
defect in receptor photochemistry, or neuro- 
pathology of the visual system. However, it 
should now be apparent that the higher 
thresholds of the experimental groups might 
have a nonsensory origin, since one could 
expect schizophrenic, anxious, neurotic, and 
aged Ss to have a more cautious attitude 
towards the test situation. Although it is true 
that threshold differences between patients 
and controls are not always found with tradi- 
tional psychophysical procedures (Dillon, 
1961; McDonough, 1960; Riciutti, 1949), 
such results should not be interpreted as dem- 
onstrating that traditional psychophysical 
procedures can yield a pure sensory threshold 
in the hands of the right experimenter, but 
rather that the attitude of the patient towards 
the experimental situation may be such that 
his criterion position happens to coincide 
with that of the normal controls. For ex- 
ample, reassuring instructions have been 
shown to remove a difference in spiral after- 
effect and flicker-fusion thresholds which had 
been found previously in brain damaged pa- 
tients (Lodge, 1961; Mayer & Coons, 1960), 
and a friendly E could produce the same
effect without specifically using reassuring 
instructions. 

It seems clear that when thresholds are 
being obtained on attitudinally biased sub-
jects, the sensory-decision procedure should be
used, not only to ensure that a pure measure
of sensory sensitivity is obtained, but to 
examine the subjective criterion as an inter- 
esting attitudinal measure in its own right.

Despite recent judgements to the contrary, considerable evidence exists sug-
gesting that glutamic acid does play a significant role in cognitive behavior.
Its effects upon intelligence are evident both in normal and in retarded persons.
Variations in results are believed to be related to (a) types of Ss employed, 
(b) the condition under which the drug is administered, (c) the method of 
assessing change in cognitive behavior, (d) method of drug administration, and 
(e) type of glutamic acid employed. Relevant physiological literature is reviewed. 


Basic biochemical research pointing to a 
central role in neural functioning for glu- 
tamic acid (Weil-Malherbe, 1936) has been 
followed by a continuing series of biological 
studies further implicating this substance in 
metabolic and neural processes. In 1944, 
Zimmerman and Ross reported positive effects 
of glutamic acid administration upon the 
learning behavior of the white rat. Later be- 
havioral studies reported that glutamic acid 
had positive effects upon animal learning; 
upon intelligence and personality in retarded, 
psychoneurotic, and normal persons; and 
that it suppressed abnormal EEG phenomena 
and controlled epilepsy. Glutamine (a metabo- 
lite of glutamic acid) was also reported to 
have positive effects upon retardate intel- 
ligence. 

It was quickly recognized, however, that 
many of the early, positive reports of the 
effects of glutamic acid upon retardate intel- 
ligence were derived from methodologically 
weak studies. Consequently, experimentally 
oriented psychologists became active with, 
presumably, better controlled studies. This 
research, however, generally failed to find 
positive effects of glutamic acid upon retardate 
intelligence. In 1960, Astin and Ross reviewed 
the literature on the effect of glutamic acid 
upon retardate intelligence and concluded that
no beneficial effect of glutamic acid adminis-
tration upon retardate intelligence had been
demonstrated. They regarded the optimistic 
conclusions of the early reports as attributable 
to a particular defect in experimental design, 
specifically, a failure to employ control groups. 

Astin and Ross’ paper made no claim to 
being comprehensive and, in fact, it was not; 
we have located a total of 25 additional re- 
search papers (representing 50% of the avail- 
able literature) on the effect of glutamic acid 
administration upon mental retardation pub- 
lished before 1960, the year of Astin and Ross’ 
review. An examination of these papers sug- 
gests that their inclusion would have, of neces- 
sity, altered Astin and Ross’ conclusions, since 
a complete compilation of glutamic acid stud- 
ies reveals that as many “positive” as “nega- 
tive” studies employed control groups. How- 
ever, we do not fault Astin and Ross primar- 
ily because of the incompleteness of their 
review. We would dispute their conclusions 
even if the discussion were limited to the ma- 
terial they cited. We take issue with Astin and 
Ross primarily because of salient aspects of 
the research with glutamic acid which they did 
not consider, but which, had they done so, 
might have led them to a different conclusion. 
Specifically, we wish to focus upon the fol-
lowing methodological problems: (a) the
characteristics of subject samples used in the
studies reporting positive versus negative re-
sults, (b) the manner of administration of
glutamic acid in the studies reporting positive 
versus negative results, (c) salient questions

TABLE 1

THIRTY-THREE STUDIES CLASSIFIED IN TERMS OF
RESULTS AND USE OF CONTROLS

             Positive results               Negative results

Control      Albert, Hoch, & Waelsch        Bergius (1954)
               (1946, 1951)                 Ellson, Fuller, & Urmston (1950)
             Foale (1952)                   Ernsting (1949)
             Head (1955)                    Kantor & Boyes (1951)
             Kurland & Gilgash (1953)       Kerr & Szurek (1950)
             Zimmerman, Burgemeister,       Loeb & Tuddenham (1950)
               & Putnam (1948)              Lombard, Gilbert, & Donofrio (1955)
                                            McCulloch (1950)
                                            Milliken & Standen (1951)
                                            Oldfelt (1952)
                                            Quinn & Durling (1950b)
                                            Zabarenko & Chambers (1952)
                                            Züblin & Lutz (1953)
             (f = 6)                        (f = 13)

No control   de la Fuente Muniz, Zuniga,
               & Yanowski (1950)
             Delay, Pichot, Puech,
               & Perse (1951)
             Harney (1950)
             Hoven (1951)
             Kane (1953)
             Levine (1949)
             Müller (1953a, 1954)
             Quinn & Durling (1950a)
             Schwöbel (1950, 1952)
             Zimmerman & Burgemeister
               (1950)
             Zimmerman, Burgemeister,
               & Putnam (1949a, 1949b)
             (f = 14)                       (f = 0)


relating to placebo effects, (d) the manner in 
which intellectual change was assessed, and 
last, (e) the nature of the patients’ environ- 
ment while they were being medicated with 
glutamic acid. 


GLUTAMIC ACID AND RETARDATE 
INTELLIGENCE 
Controls 


In their review of glutamic acid and re- 
tardate intelligence, Astin and Ross concluded 
that “positive effects tend to be reported in 
studies not employing a control group”; and 
that “the few positive studies employing 
controls contain methodological flaws . . . 
[p. 433].” The suggestion that a lack of con- 
trolled observations and the presence of 
methodological errors were responsible for the 
reports of beneficial effects of glutamic acid 
is not original with Astin and Ross, but is 
rather a note which sounded through the 
earliest reviews of this work (Anonymous,
1951; Arbitman, 1952; Hirai, 1951). It is
not a claim, however, which stands close 
scrutiny, particularly in the light of the more 
recent literature. 

Astin and Ross examined the studies listed 
in Table 1. These studies were classified into 
those yielding positive versus negative results; 
and those which employed a control group 
versus those which did not. The computed 
chi-square, with a correction for continuity, 
was “significant at the .001 level (χ² = 12.99),
indicating that positive results tend to be 
related to a lack of controls [p. 429].” 

Astin and Ross’ table, however, can be 
corrected in several ways. First, there is every 
indication that Astin and Ross meant to ex- 
clude the extensive literature on the effect 
of glutamic acid upon the intelligence of 
normal subjects and nonretarded psychiatric 
subjects, and to limit their review to “studies 
with mentally deficient subjects [p. 429],” 
yet five articles which did not employ retarded 
subjects were included in their Table 1 
(Delay, Pichot, Puech, & Perse, 1951; 
Kantor & Boyes, 1951; Müller, 1953a, 1954; 
Schwöbel, 1950). Second, there are a consider- 
able number of articles (at least 25) which 
were published prior to Astin and Ross’ re-
view, but which were not cited by them.
Third, Astin and Ross were inconsistent in
their classification of studies which did or
did not have a control group (“liberally de- 
fined as those [studies] in which a similarly 
diagnosed group was studied in the absence 
of glutamic acid medication.”) An article by 
Kerr and Szurek (1950) was cited as having 
employed a control group when, in fact, it did 
not, and a study by Schwöbel (1952) was
cited as not having a control group when,
in fact, it did. Further, Astin and Ross listed
Albert, Hoch, and Waelsch (1946) and
Zimmerman, Burgemeister, and Putnam
(1948) as having employed control groups;
these studies employed each subject as his
own control. Judging from their discussion of
these studies, Astin and Ross regarded this 
method of control as satisfying their defini- 
tion. However, Harney (1950) used the same 
method, but was listed as “no control.” Finally, 
in at least one instance, a study was listed in 
the table twice (Zimmerman et al., 1949a, is
a preliminary report of the same data in

TABLE 2


GLUTAMIC ACID STUDIES WITH RETARDED SUBJECTS,
CLASSIFIED IN TERMS OF RESULTS, AND PRESENCE
OR ABSENCE OF CONTROL GROUPS

              Positive results                 Negative results

Controls      Albert, Hoch, & Waelsch (1946)   Bergius (1954)
              Albert, Hoch, & Waelsch (1951)   Ellson, Fuller, & Urmston (1950)
              Contini-Poli (1950)              Ernsting (1949)
              Desclaux, Benoit, & Aussagel     Harbauer & DeBoor (1952)
                (1953)                         Head (1955)*
              DuPlessis (1953)                 Hellstrom & Melin (1952)
              Foale (1952)                     Leedham (1955)
              Fortunato & Canella (1951)       Loeb & Tuddenham (1950)
              Harney (1950)                    Lombard, Gilbert, & Donofrio (1955)
              Ja[...], Gilbert, [...], &       McCulloch (1950)
                Williams (1953)                Milliken & Standen (1951)
              Koch (1954)                      Oldfelt (1952)
              Kurland & Gilgash (1953)         Pallister & Stevens (1957)
              Schwöbel (1952)                  Quinn & Durling (1950b)
              Zimmerman & Burgemeister         Zabarenko & Chambers (1952)
                (1959a)                        Züblin & Lutz (1953)
              Zimmerman, Burgemeister, &
                Putnam (1948)*

No controls   [entries illegible; surviving fragments: “(1956” and “Donnadieu”]

* The difference in IQ scores between [...] control group was reported as
[...] by Head [...] Astin and Ross (1960).
* [...] Burgemeister, & Putnam, [...]
* Positive results with a nonmongol sample; negative results with a
mongol sample.
* Completed report of an earlier paper (Zimmerman, Burgemeister, &
Putnam, 1949a).


Zimmerman et al., 1949b). In line with the 
above discussion, the table presented by Astin 
and Ross has been corrected (see Table 2). 
As may be seen, there are now almost as 
many “positive results, control” (N = 14)
as “negative results, control” (N = 16). The 
chi-square (corrected for continuity) of 2.93 
is not statistically significant. Thus it would 
seem that the simple presence or absence of 
a control group does not differentiate the 
studies which obtained positive results from
those which did not. The question of what
differentiates studies reporting positive results 
from studies reporting negative results, then, 
remains to be answered. 

Moreover, we believe that because it is 
based on certain false assumptions, no classifi- 
cation of studies, such as in Table 1 or 2, 
can ever result in a proper appreciation of 
the literature. For example, after alleging a 
relationship between positive results and the 
absence of control groups, Astin and Ross 
(1960) proceeded to dismiss the results of 
the “no control” studies without further com- 
ment. Presumably, their assumption was that 
in the absence of a control group, positive re- 
sults may be attributable to placebo effects of 
glutamic acid; to environmental stimulation 
that might be attendant to a retarded sub- 
ject’s being included in a glutamic acid study; 
to natural increments in IQ as a function of 
growth and experience; to regression effects; 
or to chance. In fact, Astin and Ross did not 
utilize the available evidence to investigate 
whether the results of the “no control” studies 
were attributable to these variables; they 
simply regarded the absence of a control 
group as adequate grounds for dismissal of a 
study. Yet it can be demonstrated, by infer- 
ence, at least, that it is most unlikely that the 
IQ increments observed in the “no control” 
studies which reported positive results were 
due to factors other than glutamic acid. Of the 
30 glutamic acid studies with mental re- 
tardates using control groups, only 4 studies 
reported significant rises in intelligence in the 
control groups (Contini-Poli, 1950; DuPlessis, 
1953; Ellson, Fuller, & Urmston, 1950; Quinn 
& Durling, 1950b). The mean rises in IQ of 
the control groups in these four studies were, 
respectively, 1.52, 2.15, 1.9, and 4.5 IQ 
points.² In contrast, considerably greater IQ
gains are reported in the studies which did 
not employ control groups, but which reported 
significant test gains in subjects fed glutamic 
acid: six such studies reported mean IQ gains 
of 5 to 10 IQ points (Clapp, 1949; de la Fuente 
Muniz, 1950; Goldstein, 1954; Zimmerman 
& Burgemeister, 1950, 1959b; Zimmerman, 
Burgemeister, & Putnam, 1949b); two re-

2 Astin and Ross (1960) incorrectly cited Zaba-
renko and Chambers (1952) and Ernsting (1949) as
reporting significant rises in IQ in control groups.

ported gains of 10-20 points (Kane, 1953;
Levine, 1949), and only one reported a mean 
gain of less than five points (Quinn & Durling, 
1950a). The much greater size of the IQ gains 
following glutamic acid therapy compared to 
the IQ gains of control groups suggests that
the gains in the experimental groups are not 
attributable to environmental factors other 
than glutamic acid. Consequently, any sum- 
mary dismissal of a study on the grounds that 
a “no treatment, control” group was not em- 
ployed does not seem to be justifiable. 

A case in point is a paper by Kane (1953), 
which was one of those dismissed without 
discussion by Astin and Ross on the grounds 
that it failed to include a “no [glutamic acid] 
treatment” control group. Kane reported that 
glutamic acid was “of decided value in cases 
in which the triad: emotional immaturity, 
mental retardation, and brain insult” ap- 
peared; and that it might be of some value 
to the psychotic and epileptic, but that it 
was of no value in other cases of retardation. 
Kane’s sample consisted of 106 “immature, 
brain-injured, retarded” children who gained 
an average of 17 IQ points after 3 months of 
treatment with glutamic acid. She reported no 
change as a result of glutamic acid treatment 
in a number of other samples whose members 
manifested none, one, or two characteristics 
of the triad. It seems most unlikely that 
Kane’s results could have been due simply 
to placebo effects since no studies in the 
literature reported effects anywhere near so 
great following glutamic acid treatment, let 
alone placebo; and none have reported such 
dramatic effects on so large a number of per- 
sons. The Kane study might well have served 
to initiate well-controlled, methodologically 
sophisticated studies to investigate the va- 
lidity of her claim, but in fact it did not. 
No later study has utilized a sample similar 
to that employed by Kane. 

It is possible that studies reporting positive 
results from glutamic-acid therapy may capi- 
talize on regression effects resulting from an 
initial selection of subjects with low IQs. Re- 
gression effects are usually controlled by using 
a matched control group. However, Ellson, 
Fuller, and Urmston (1950) noted that in at 
least one instance (Zimmerman, Burgemeister, 
& Putnam, 1947) a control group was chosen
in such a manner that a regression effect
magnified the difference between a glutamic 
acid and a control group. Upon reanalysis 
of the data (Ellson, Fuller, & Urmston, 
1950), the difference in favor of the glutamic 
acid group was found to be lessened, although 
still present. 

The problem of regression effects is left 
unresolved in studies which select subjects 
on the basis of IQ and do not use a control 
group. Studies which obtained positive results 
after glutamic acid treatment but which did 
not employ a control group have, therefore, 
been closely examined to ascertain the extent 
to which regression effects may have con- 
tributed to their positive results. Such an 
examination has suggested, however, that re-
gression effects probably did not play a major
role in the “positive results, no control” stud- 
ies, for the following reasons: First, we have 
already shown that in the 30 studies in which 
control groups were employed, and in which 
placebo effects, practice effects, and any pos-
sible regression effects were presumably work-
ing in the same direction to produce elevated
scores, only four control groups showed sig- 
nificant rises upon retest; and, in three of the 
four cases, the rises were 2.15 points or less.
In contrast, the rises in the “no control, 
glutamic acid” studies were, with one excep- 
tion (Quinn & Durling, 1950a), greater than 
five points. Second, although it is sometimes 
difficult to be certain from the written reports, 
it appears that most studies did not select 
their subjects on the basis of IQ except insofar 
as they had been categorized as retardates.
The usual procedure seems to have been to
select subjects on some basis other than IQ,
for example, institutionalized retardates, non-
institutionalized retardates, mongoloids, fa-
milial retardates, etc., and then determine
pre- and post-treatment IQs. This selection 
procedure does not induce regression effects. 
Only two “positive results, no control” studies 
appear to have chosen subjects from the lower 
end of a test distribution (Goldstein, 1954; 
Quinn & Durling, 1950a). Consequently, pru- 
dence would suggest that these two studies 
be eliminated from further consideration in
the ensuing discussions. It should be noted
that with the omission of the two studies, the
value of the chi-square based on the con-
tingencies in Table 2 changes from 2.93 to
2.16, remaining not statistically significant; 
thus, the conclusion remains tenable that the 
simple presence or absence of a control group 
does not differentiate the studies which ob- 
tained positive results from those that did not. 
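
The two chi-square values for Table 2 can be checked with a short sketch of the continuity-corrected formula for a 2 × 2 table. The control-group frequencies (14 and 16) are stated in the text. The no-control frequencies (17 positive, 6 negative) are an inference, not figures given here: they are obtained by subtracting the control counts from the totals implied by the dosage tallies reported later in this review (31 positive and 22 negative studies). The function name `yates_chi_square` is illustrative.

```python
def yates_chi_square(a, b, c, d):
    """Continuity-corrected chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    # Yates' correction subtracts n/2 from |ad - bc| (floored at zero).
    corrected = max(abs(a * d - b * c) - n / 2, 0.0)
    return n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: control, no control. Columns: positive results, negative results.
# 14 and 16 come from the text; 17 and 6 are inferred (assumption).
full_table = yates_chi_square(14, 16, 17, 6)

# Omitting the two "positive results, no control" studies leaves 15 there.
reduced_table = yates_chi_square(14, 16, 15, 6)

print(f"all studies:          chi-square = {full_table:.3f}")
print(f"two studies omitted:  chi-square = {reduced_table:.3f}")
```

With one degree of freedom, both values fall short of the conventional .05 critical value of 3.84.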


Subject Samples 


The studies reporting positive results 
tended to use noninstitutionalized retardates 
who were either living in their parents’ homes 
or attending private boarding schools, whereas 
the studies reporting negative results tended 
to employ institutionalized subjects. 

Of the positive studies listed in Table 2 (ex- 
cepting Goldstein, 1954; Quinn & Durling, 
1950a) six studies employed institutionalized 
subjects (Desclaux, Benoit, & Aussagel, 1953; 
Foale, 1952; Hoven, 1951; Kurland & 
Gilgash, 1953; Schwöbel, 1952; Tschakert, 
1953). In four positive studies, the composi- 
tion of the subject sample cannot be inferred 
with certainty (Bergamini, 1956; Borak & 
Borak, 1957; Clapp, 1949; de Moragos, 1949), 
and in one study (Fortunato & Canella, 1951) 
the sample was evenly mixed. In each one of 
the remaining 18 positive studies, the sub- 
jects were noninstitutionalized retardates who 
were living in their parents’ homes or at- 
tending private boarding schools. 

Of the negative studies listed in Table 2, 
three employed samples composed exclusively 
or almost exclusively of noninstitutionalized 
subjects (Head, 1955; Kerr & Szurek, 1950; 
Vasquez & Farago, 1957); in five studies, the 
composition of the samples cannot be inferred 
with certainty (Donnadieu & Achalle, 1951; 
Ernsting, 1949; Harbauer & DeBoor, 1952; 
Leedham, 1955; Shchlembelev, 1959); and in 
one study, it was evenly mixed (the mongol 
sample of Zimmerman, Burgemeister, and 
Putnam, 1949b). The remaining 13 studies 
employed samples composed exclusively or al- 
most exclusively of institutionalized subjects. 

Summarizing, of the positive studies in 
which the composition of the sample can be 
ascertained, 18 employed noninstitutionalized, 
and 6 employed institutionalized retardates; 
while of the negative studies, 3 employed non- 
institutionalized retardates and 13 employed 
institutionalized retardates (χ² = 10.03, p <
.005).
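
As a check on the arithmetic, this value can be reproduced from the four frequencies just given. The worked equation below assumes that the chi-square was computed for a 2 × 2 table with the correction for continuity, as was done for the earlier tables; the text does not state this explicitly for the present comparison. Here a = 18, b = 6, c = 3, d = 13, and N = 40:

```latex
\[
\chi^2
  = \frac{N\left(\lvert ad - bc\rvert - N/2\right)^2}{(a+b)(c+d)(a+c)(b+d)}
  = \frac{40\,(216 - 20)^2}{24 \cdot 16 \cdot 21 \cdot 19}
  = \frac{1{,}536{,}640}{153{,}216}
  \approx 10.03
\]
```
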
This tendency of studies reporting positive
findings to employ outpatients rather than 
inpatients had been noted earlier (Oldfelt, 
1952); however, no investigation of the phe- 
nomenon has been reported. Several explana- 
tions suggest themselves. First, the parents 
of the outpatients, aware that something 
special is being done for their child (since it 
is they who administer the drug at home) 
may watch their child closely for some slight 
sign of positive change, which is then, as 
Quinn and Durling (1950b) put it, “capital- 
ized on and implemented by [the] doting 
parents.” This would seem to be a definite 
danger when, as in the early Zimmerman 
studies (Zimmerman et al., 1947, 1948), the 
patients and parents knew the nature of the 
drug. The danger, of course, disappears en- 
tirely in cases where glutamic acid can be 
shown to have a more positive effect than a 
placebo. Second, the outpatient retardates, 
living at home, are more likely to be in a 
favorable situation than institutionalized 
retardates—they have access to more individ- 
ualized care, more personal attention, and a 
more even environment, all of which may 
create the kind of positive atmosphere that 
might make them more responsive to thera- 
peutic intervention. Third, the fact that 
the noninstitutionalized retardates have the 
resources to maintain themselves outside of 
an institution suggests better adjustment, 
greater maturity, or a lesser degree of dis- 
ability than institutionalized retardates and, 
consequently, a greater amenability to thera- 
peutic intervention. Finally, there may be a 
debilitating and therapeutically inimical effect 
of institutionalization upon certain classes of 
retarded persons. 

The question of why certain investigators 
chose to work with outpatients while others 
worked with inpatients may relate to experi- 
mental versus clinical orientations of the in- 
vestigators. In some cases, at least, the use 
of institutionalized patients was determined 
by a desire for greater experimental rigor, 
specifically, a wish to keep the environmental 
conditions similar for all subjects (Ellson, 
Fuller, & Urmston, 1950; Oldfelt, 1952; 
Zabarenko & Chambers, 1952). No recogni- 
tion was evident, in these instances, that 
this procedure might result in less thera-
peutically amenable patient samples. On the
other hand, some clinically oriented investi- 
gators seem to have employed noninstitu- 
tionalized subjects out of a wish to work with 
the more promising varieties of retarded pa- 
tients, at the same time displaying a relative 
lack of concern over the problem of environ- 
mental control (Zimmerman & Burgemeister, 
1950). 


Individual Differences 


Another difference between the positive 
and negative studies which may relate to a 
clinical versus experimental orientation in- 
volves attitudes towards individual differences. 
The clinician who usually deals with individ- 
uals is quite naturally concerned with consti- 
tutional phenomena governing individual dif-
ferences. The experimental psychologist, on
the other hand, has typically been interested
mainly in the effects of externally manipu-
lated variables on a sample of subjects.

In this context, it is pertinent that the posi- 
tive studies often compared the responses to 
glutamic acid of various diagnostic categories 
of retardates; for example, mongols versus non- 
mongols, primary versus secondary retar- 
dates, severe versus mild retardation, etc., 
(Bergamini, 1956; Borak & Borak, 1957; 
Clapp, 1949; de la Fuente Muniz et al., 
1950; de Moragos, 1949; Kane, 1953; 
Koch, 1954; Lafon et al., 1952; Zimmer- 
man et al., 1948, 1949b, 1950, 1959a, 
1959b). When differences were found, it was 
usually concluded that glutamic acid may 
affect some varieties of retardation more than 
others (Borak & Borak, 1957; Kane, 1953; 
Lafon, Faure, & Bascou, 1952; Zimmerman 
et al., 1959a, 1959b). Thus, concern with
individual differences tends to be present in 
the methodological approach of the positive 
studies. On the other hand, concern with 
differential responses to glutamic acid as a 
function of type of mental retardation is 
present in only a few of the negative studies 
(Gollnitz, 1952; Vasquez & Farago, 1957; 
Zabarenko & Chambers, 1952; Zimmerman, 
Burgemeister, & Putnam, 1949b). If glutamic 
acid is therapeutically effective with only a 
segment of the retarded population, this fact 
would be disguised by the methodological ap-
proach employed in the majority of the “nega-
tive” studies.


Glutamic Acid Administration: The Problem
of Dosage Level


In administering a drug to patients over a
long period of time, clinicians often follow a
standard procedure in determining dosage
level: the initial dose of the drug is small
and is gradually increased up to the point
where the patient shows a toxic response to
the drug. The dosage level is then reduced to
below the level of toxicity. In this manner, a
maximal drug response is obtained from all
patients. This method of drug administration
may be termed an “individualized” method,
that is, the level of medication a patient re-
ceives is determined by his own reaction to
the drug. The method may be modified in
that the dose may not be increased to toxicity
but may be varied depending upon the pa-
tient’s response to lower than toxic doses.
Throughout their numerous publications,
Zimmerman and his colleagues have stressed
the necessity of adhering to the individualized
method of administering glutamic acid if posi- 
tive results are to be obtained. They have been 
echoed in their plea by Contini-Poli (1950), 
de la Fuente Muniz et al. (1950), and others. 

However, the individual or clinical ap-
proach to drug administration greatly com-
plicates matters for scientists who are attempt-
ing to evaluate the efficacy of drugs. Astin and
Ross (1960) argued against the individualized
method on the grounds that if the dosages are
increased to toxicity, it becomes apparent who
is in the placebo and who is in the experi-
mental group. Also, if the members of an
experimental sample are on various dosages, it
can be argued that individual differences in
response to the drug are attributable to the
differences in dose level. It seems more “scien-
tific,” then, to give all subjects the same
amount of the drug, even though the
response to the drug is allowed to vary across
individuals. The procedure of giving all sub-
jects the same dosage level of a drug ignores
the pharmacological facts in favor of a super-
ficially clean-cut experimental design.

Studies which administer the same dose to
all subjects are presented with a problem.
They may administer a small dose to all sub-
jects, whereupon they risk not obtaining the
sought-after behavioral response. Or they
may give all subjects a very high dose and 
try to maximize a positive response, while 
ignoring the problem of toxic effects. Za- 
barenko and Chambers (1952), for example, 
chose the latter course and gave all subjects 
40 grams—a massive dose, considering half 
their subjects were children 5-11 years of 
age. While they reported that “in 10,000 pa- 
tient days of medication, there were at most 
half a dozen instances of gastric disturbance,” 
there appear to have been definite toxic ef-
fects apparent on intelligence testing, since,
in a later report on the same data (Chambers 
& Zabarenko, 1956), they stated that after 
glutamic-acid therapy there was a “lessening 
in ability to inhibit responses, which led to 
impulsive errors.” “Impulsivity,” it must 
be noted, is one of the behavioral effects 
of an overdose of glutamic acid. Since 
Zabarenko and Chambers (1952) reported no 
positive effect of glutamic acid on intelligence 
testing, one wonders if their method of ad- 
ministration of glutamic acid was to blame. 
Similar observations may be made of the 
study of Ellson, Fuller, and Urmston (1950) 
in which subjects (all of whom were children) 
were maintained on fairly high dosage levels 
(30 grams). Ellson et al. reported no positive 
effect of glutamic acid upon intelligence, and 
at the same time reported that the glutamic 
acid group evidenced significantly more im- 
pulsive, error-prone behaviors on one of the 
cognitive tests. 

Comparison of the positive and negative 
studies indicates a statistically significant 
tendency for the positive studies to vary the 
dosage level for individuals, while the negative 
studies tend to administer the same dosage 
level to all subjects (χ² = 7.11, p < .01). Spe-
cifically, of the positive studies listed in
Table 2 (once more excepting Goldstein, 
1954; and Quinn & Durling, 1950a), 9 studies 
administered a fixed dose to all subjects 
(Albert, Hoch, & Waelsch, 1946; Bergamini, 
1956; Desclaux et al., 1953; DuPlessis, 1953; 
Foale, 1952; Kurland & Gilgash, 1953; 
Levine, 1949; Pabst & Wurst, 1952; Tscha- 
kert, 1953). The remaining 20 used the in- 
dividualized method of drug administration. 
Of the negative studies, 16 gave all subjects
the same dose, while only 6 studies used an
individualized method of drug administration 
(Bergius, 1954; Hellstrom & Melin, 1952; 
Kerr & Szurek, 1950; Milliken & Standen, 
1951; Vasquez & Farago, 1957; Zimmerman 
et al., 1949b). 
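
The dosage comparison can be recomputed from the four frequencies given above (9 and 20 for the positive studies, 16 and 6 for the negative studies). The sketch below assumes the same continuity-corrected chi-square used for the earlier tables, written here in its observed-versus-expected form, and obtains the one-degree-of-freedom probability from the complementary error function. The names `yates_chi_square` and `p_value_df1` are illustrative.

```python
from math import erfc, sqrt


def yates_chi_square(table):
    """Continuity-corrected chi-square from observed and expected frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            # Shrink each |O - E| by 0.5 before squaring (floored at zero).
            total += max(abs(observed - expected) - 0.5, 0.0) ** 2 / expected
    return total


def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return erfc(sqrt(chi_square / 2))


# Rows: positive, negative studies. Columns: fixed dose, individualized dose.
dosage_table = [[9, 20], [16, 6]]
chi_square = yates_chi_square(dosage_table)
print(f"chi-square = {chi_square:.2f}, p = {p_value_df1(chi_square):.4f}")
```

The same two functions apply unchanged to any of the other 2 × 2 classifications of studies discussed in this review.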


Glutamic Acid versus Glutamate Salts 


The early positive studies of glutamic acid 
treatment of retardates used the free form of 
the natural glutamic acid. For reasons not 
entirely clear, but which probably had to do 
with convenience of administration, the so- 
dium salt of glutamic acid, monosodium 
glutamate, later came into common use. The 
assumption of course was that glutamic acid 
and monosodium glutamate were equivalent 
in their physiological effect. The available evi- 
dence now suggests that this assumption is 
not a valid one. Pond and Pond (1951), for 
example, reported that the salts of glutamic 
acid increased, while the free acid decreased 
epileptic activity both clinically and in the 
EEG. In addition, no study which used mono- 
sodium glutamate reported positive effects of 
the drug on the intellectual functioning of 
retardates (Ellson, Fuller, & Urmston, 1950; 
Loeb & Tuddenham, 1950; Milliken & Stan-
den, 1951; Pallister & Stevens, 1957).

Only one paper reported an experimental 
comparison of the effects of the free acid and 
any one of its salts upon mental functioning. 
Albert, Hoch, and Waelsch (1951) found 
significant positive effects of the free, natural 
form of glutamic acid upon intellectual func- 
tioning of retardates. These investigators then 
selected 10 mental defectives who had re- 
sponded to the free, natural acid with sig- 
nificant IQ rises and gave these patients a 
placebo, until their scores had returned to 
pretest level. At this point, the retardates 
were administered potassium glutamate for 
3 to 4 months without their evidencing 
any signs of positive or negative intellectual 
change. 


The Problem of Intellectual Assessment 


Zimmerman and his colleagues (1948, 
1949a, 1949b, 1950) originally held that 
intellectual ability was increased by glutamic 
acid in a way that was assessable with psy- 
chometric devices. Albert, Hoch, and Waelsch 
(1946, 1951) held that intellectual ability as
such was not increased by glutamic acid but 
that certain factors were enhanced which en- 
abled a person to effectuate what intellectual 
ability was already present. These factors 
were concentration, attention, motivation, 
and persistence—what Wechsler (1944) has 
termed the “nonintellective” or “tempera- 
mental” components of intelligence. It fol- 
lows from the position of Albert et al. that 
glutamic acid could affect cognitive behavior
in a way which would be clinically apparent, 
but which would not necessarily be reflected 
in increased IQ scores. Not until quite late 
did Zimmerman (1959a, 1959b) accept the 
view of Albert et al. Finally, there were a 
number of writers who held that no aspect 
of intellectual functioning was affected by 
glutamic acid. Thus, the fact that a division 
existed among the glutamic acid adherents 
as to whether a clinical or a psychometric 
method should be employed to evaluate intel- 
lectual changes was lost in the larger issue 
of whether there was an effect of glutamic 
acid at all. 

The evidence for or against any of these 
positions is difficult to evaluate, primarily 
because of the difficulty in ascertaining the 
scope of the clinical procedures, if any, 
which were employed in the various studies. 
It is not uncommon to find the comment 
“no clinical changes were noted” without any 
description of the procedures used to observe 
clinical phenomena. The studies reporting 
positive psychometric results, however, do 
tend to report associated clinical changes. For 
example, of the 31 positive studies listed in 
Table 2, 25 instituted clinical assessment 
procedures (all except Desclaux et al., 1953; 
DuPlessis, 1953; Harney, 1950; Jaeger-Lee, 
1953; Kane, 1953; Zimmerman et al., 
1949b), and all 25 reported some kind of 
clinically observable intellectual changes, 
generally in the direction of increases in 
mental activity, alertness, spontaneity, and 
motivation. On the other hand, if glutamic 
acid affects clinical aspects of cognitive be- 
haviors independent of psychometric change, 
there should be studies which report clinical 
changes in the absence of psychometric 
changes. Several such studies, in fact, exist 
(Bergius, 1954; Gollnitz, 1952; Harbauer & 
DeBoor, 1952). However, it cannot be denied
that reports of positive clinical change may 
be systematically biased in those instances in 
which the judge is aware of which subjects 
are receiving the active medication. 

Thus, an aspect of the psychometric assess- 
ment procedure which deserves mention in- 
volves the tester’s or clinician’s knowledge of 
whether he is examining a control or a 
glutamic acid patient. Astin and Ross stressed 
the desirability of keeping the tester “blind” 
in order to preserve his objectivity. It was 
partially on the basis of more frequent “blind 
testing” that they based their judgment of 
the superior methodology of the negative 
studies; blind testing has been employed by 
only four positive studies (Albert, Hoch, & 
Waelsch, 1951; Contini-Poli, 1950; Zimmer- 
man & Burgemeister, 1959a, 1959b), but by 
nine negative studies (Bergius, 1954; Ellson, 
Fuller, & Urmston, 1950; Harbauer & De- 
Boor, 1952; Hellstrom & Melin, 1952; Lom- 
bard, Gilbert, & Donofrio, 1955; McCulloch, 
1950; Milliken & Standen, 1951; Pallister & 
Stevens, 1957; Zabarenko & Chambers, 
1952). 


Intellectual Stimulation and Glutamic Acid 


Glutamic acid studies have followed a gen- 
eral pattern: subjects were first evaluated 
intellectually, then given glutamic acid and 
reevaluated; the degree of intellectual change 
over the course of treatment was usually 
compared with that of a “no treatment”
group. The environmental stimulation inter-
vening between the beginning and end of
glutamic acid or placebo treatment was held
constant for both groups. This has usually
meant that, with the exception of the brief
periods each day during which the experi-
mental or control medication was taken, the
subjects were not subjected to any formal
training experiences, that is, they continued
with their daily routines as before.

The difficulty with this method is that it
does not conform to the general theoretical
view of the origins of abilities upon which
estimates of intelligence are based. Abilities
are generally thought to be acquired as a
function of experience, reward, etc. (Fergu-
son, 1954; 1956). It is meaningless to use
an intelligence test which requires demonstra-
tion of particular abilities and kinds of in-
formation if the testee has never been exposed
to situations which permit him to acquire 
these skills and information. Similarly, it is 
meaningless to assess the effects of a drug 
on intelligence with tests of specific skills 
and information when the subject has had 
no opportunity during the course of the drug 
administration to acquire new information or 
skills. At least one investigator (Clapp, 1949) 
has made the clinical observation that 
glutamic acid seems to have little effect unless 
it is embedded in a program of formal 
training.

The literature does provide several ex- 
amples of studies in which children who were 
attending special classes for retardates were 
given glutamic acid therapy; such a class 
may be thought of as providing an intel- 
lectually enriched environment. It is es- 
pecially suggestive that two of the most 
successful glutamic acid studies employed 
students who were enrolled in classroom situ- 
ations: Levine (1949) reported a mean rise 
of 13 IQ points for retarded pupils fed 
glutamic acid who were enrolled in special 
classes for the deaf. Contini-Poli (1950) 
reported a mean rise of 15 IQ points for 
slow school children fed glutamic acid who 
were enrolled in special classes, as opposed 
to a 2-point rise for control pupils on a 
placebo. Other investigators have also re- 
ported statistically significant IQ rises or sig- 
nificant improvements on cognitive tasks after 
glutamic acid treatment was given to children 
who were receiving training in special schools 
(DuPlessis, 1953; Harney, 1950; Koch, 
1954; Schwöbel, 1952). On the other hand,
at least four studies of retarded children in 
special classes observed no effect of glutamic
acid treatment upon cognition (Harbauer & 
DeBoor, 1952; Lombard et al., 1955; Oldfelt, 
1952; Zublin & Lutz, 1953). However, Old- 
felt’s (1952) results should be regarded with 
caution since half of the subjects were not
given their postexperimental psychometric 
evaluations until 3 months after treatment 
had ended. If Oldfelt’s study is set aside, a 
total of seven studies which have adminis- 
tered glutamic acid in a context of formal 
training have reported positive results 
(Clapp, 1949; Contini-Poli, 1950; DuPlessis, 
1953; Harney, 1950; Koch, 1954; Levine,
1949; Schwöbel, 1952) as opposed to three 
studies which reported negative results (Har- 
bauer & DeBoor, 1952; Lombard et al., 1955; 
Zublin & Lutz, 1953). Thus, it seems that 
embedding glutamic acid in a program of 
classroom training improves the chance of 
obtaining an increment in IQ. 


Effect of Glutamic Acid upon Retardate 
Intelligence: A Summary 


Several important differences discriminate 
the positive and negative glutamic acid 
studies. The positive studies tended (a) to 
employ outpatient as opposed to inpatient 
retardates; (b) to pay more attention to 
individual differences among retardates in 
response to the effects of glutamic acid; (c) 
to employ varying dosage levels of glutamic 
acid for different individuals rather than 
giving all subjects the same amount of the 
drug; (d) to medicate patients with the free 
form rather than with the salt of glutamic 
acid; and (e) to utilize clinical assessments 
in addition to psychometrics, rather than re- 
lying solely on the latter to determine intel- 
lectual functioning. About an equal number 
of positive and negative studies employed 
control groups and used placebos. There are 
two areas in which the negative studies, taken 
as a group, may be said to evidence greater 
experimental rigor than the positive studies: 
first, more negative than positive studies em- 
ployed “blind” psychological testing. Second, 
(a point not mentioned earlier) more nega- 
tive than positive studies attempted to equate 
glutamic acid and the placebo for taste.³

In those cases in which control groups were 
employed, the reported procedures do not 
seem to differentiate in any major way be- 
tween the positive and negative glutamic acid 
studies. For example, no instances were found 
in which the experimental and control groups 
were not (a) chosen from the same parent 
population, (b) of equivalent chronological 


3 It should be noted that of the studies reporting
attempts to equate the placebo and glutamic acid
for taste, Loeb and Tuddenham (1950) and Oldfelt
(1952) reported that they were unable to do so.
Ellson et al. (1950) reported only that the two 
tastes were “similar.” It is unclear what degree of 
success the other studies enjoyed in this regard. 
age, (c) kept under (apparently) equivalent
environmental conditions. However, there 
were three studies in which the several re- 
tardate groups under investigation were not 
of equivalent intelligence levels on initial 
testing (McCulloch, 1950; Zabarenko & 
Chambers, 1952; Zimmerman, Burgemeister, 
& Putnam, 1947). In the first two studies, no 
effect of glutamic acid was reported; in the 
last, there was a regression effect, but, as 
discussed previously, the superiority of the 
glutamic acid group remained even after the 
regression effect had been taken into account. 
In Oldfelt’s (1952) negative study, half of 
the subjects did not receive their final psycho- 
metric evaluation until 3 months after treat- 
ment had ended; thus, it seems possible that 
any beneficial effect of treatment upon cogni- 
tion might have been lost in the interim 
between end of treatment and final evaluation. 

It appears then that the effectiveness of 
glutamic acid treatment in mental retardation 
which has come to be considered a closed ques- 
tion with an answer in the negative might well 
be reopened. The application of both methodo- 
logical rigor and clinical sophistication is re- 
quired in any psychological research venture; 
it is clear that the two have not been com- 
bined in any optimum balance in the history 
of the investigation of the effects of glutamic 
acid upon mental retardation. 


RESULTS OF GLUTAMIC-ACID ADMINISTRA-
TION ON THE INTELLECTUAL PROCESSES
OF NORMAL PERSONS


While it is generally unrecognized, a large 
number of well-designed studies have reported 
positive effects of glutamic acid upon the 
cognitive functioning of normals. In fact, the 
efficacy of glutamic acid in improving the 
cognitive capacity of normal subjects is ap- 
parently more pronounced than its effect with 
retardates.

The first study to report positive effects 
of glutamic acid treatment upon normals was 
that of Milliken and Standen (1951), who ad- 
ministered monosodium glutamate to a sam- 
ple of boys under 10 years of age and a 
sample of boys over 10 years of age. An 
individualized method of drug administration 
was employed (the boys received from 20-36 
grams per day). By any technical standard, 
the study was well designed: control groups,
matched for age, received placebos matched 
for appearance and taste with glutamic 
acid; treatment was continued for 3 months, 
after which the experimental and placebo 
medication was switched. The study was 
double-blind. No significant differences be- 
tween placebo and glutamic administrations 
were found in the younger sample, but, in the 
older sample, the glutamic acid group was 
superior to the control group on the Arith-
metic (p < .05) and Digit Symbol (p < .01)
subtests of the Wechsler-Bellevue intelligence 
test. After the medication was switched, the 
current glutamic acid group showed superior 
performance to the current control group on 
the Comprehension, Block Design, and Digit 
Symbol subtests (p < .025). 

Subsequently, Müller (1955), using six
groups of normal adults, reported that a 
group of subjects fed 6 grams daily of the 
free, natural form of glutamic acid showed 
significantly greater resistance to fatigue and 
increased learning rates on simple tasks when 
compared to five other groups of subjects 
fed either lesser amounts of the same sub- 
stance, or glutamic acid in the protein-bound 
form. Working with monozygotic twins, 
Müller (1957) fed half his subjects glutamic 
acid supplement and gave their twins another 
amino acid; he observed that the glutamic- 
acid-fed twins showed superior performance, 
and when the medication of the two groups 
was switched, the glutamic acid group again 
showed superior performance. Kergl (1952) 
reported superior intelligence-test performance 
in 57 normal adults and children fed glutamic 
acid as compared with 10 normals fed a pla- 
cebo. Koch (1954), employing children of nor- 
mal intelligence, also found superior perform- 
ance on intelligence tests in a group fed glu- 
tamic acid compared to a placebo-fed control 
group. Schwöbel and Tamm (1952) gave nor- 
mal adults glutamic acid and reported im- 
proved performance on simple cognitive tasks, 
decreases in reaction time, improved motiva-
tion, and increased energy levels; however,
they employed no control group.

Other positive studies have compared
motivational and personality traits of groups
of normal subjects fed glutamic acid with
groups of normals fed placebos matched for
taste and appearance with glutamic acid.
Müller (1953a) reported that glutamic acid 
led to “increased drive, and heightened abil- 
ity to utilize one’s potential. Changes in self- 
concept occur so that the person feels more 
competent, and acts more competently.” 
Studying convicted criminals of normal IQ, 
Müller (1953b) reported that the Rorschachs
of the glutamic acid groups became less con- 
stricted than those of the placebo group. 
Kronicke, and Zinits and Kuhr, in unpub- 
lished studies cited by Mehl (1956), both 
reported positive effects of glutamic acid upon 
the motivational and drive states of high- 
level normals (Kronicke employed university 
professors and graduate students; Zinits and 
Kuhr used university students). 

On the other hand, there have also been 
reports that glutamic acid had no effect on 
the performance of normal children or adults 
(Kane, 1953; Head, 1955; Mehl, 1956). 
Interestingly, neither Head nor Mehl em- 
ployed the individualized method of drug 
administration. 

Thus, most of the studies with normals 
support the view that glutamic acid has an 
influence on mental functioning. The German 
writers generally have expressed the opinion 
that glutamic acid achieves its effects by 
increasing drive, and by positively affecting 
other personality characteristics which, if they 
are not intellectual traits in the narrow sense 
of the term, are certainly related to intel- 
lectual performance (Kronicke, and Zinits & 
Kuhr, cited by Mehl, 1956; Müller, 1953a,
1953b, 1957; Schwöbel & Tamm, 1952).
Alternatively, glutamic acid may increase ca- 
pacity to deal with simple or speed-related 
cognitive tasks (Müller, 1955; Schwöbel &
Tamm, 1952). 


SIGNIFICANCE OF GLUTAMIC ACID IN 
PHYSIOLOGICAL FUNCTIONING 


The voluminous biological literature on 
glutamic acid has focused heavily upon such 
theoretical issues as the role of glutamic acid 
in carbohydrate metabolism, in neural func- 
tioning, and in the removal of waste products. 
However, the psychiatric and psychological 
literature on the role of glutamic acid in 
intellectual functioning rarely makes reference
to the biological literature. The mistaken im-
pression is given that the only rationale for
expecting glutamic acid to affect mental func- 
tioning is the purely empirical one that such 
an effect has been claimed in the past. Conse- 
quently, it seems appropriate at this point to 
discuss the physiological effects of glutamic 
acid which may be relevant to intellectual 
functioning. 

1. One of the most comprehensively pre- 
sented theories in regard to the behavioral 
effects of glutamic acid was Weil-Malherbe’s 
(1950) proposal that the behavioral effects 
of glutamic acid were adrenergic; that is, 
that glutamic acid stimulated the release of 
adrenaline, thus stimulating increased mental 
and physical activity. However, Strecker 
(1957), reviewing the biochemical and physio- 
logical literature since Weil-Malherbe’s paper, 
concluded that a number of mechanisms must 
necessarily be involved in promoting the 
varied physiological effects of glutamic acid, 
although he did not deny that adrenergic 
mechanisms were among them. 

2. A generally accepted role of glutamic 
acid in the brain is the removal of intra-
cellular ammonia; a convincing and well-
documented statement of this thesis was pre-
sented by Weil-Malherbe (1950). Glutamic
acid may exercise its effect upon mental func- 
tioning through its role in the removal of 
toxic waste, such as ammonia, from the brain. 

3. There is evidence that in emergency 
conditions when glucose is in short supply, 
such as hypoglycaemic coma, glutamic acid 
substitutes for glucose in supporting respira- 
tion in brain tissue (Cravioto, Massieu, & 
Izquierdo, 1951; Dawson, 1949; Himwich & 
Sullivan, 1956; Himwich & Peterson, 1958; 
Waelsch, 1951; Waelsch, Schwerin, & Bess- 
man, 1949). Consequently, it seems possible 
that glutamic acid may support the respira- 
tion of brain tissues in chronic conditions in 
which brain functioning has been impaired 
by reduced glucose supply. 

4. Glutamic acid plays a central role in 
a complex pattern of protein and carbo- 
hydrate metabolism in nervous tissue gener- 
ally and in the brain in particular; some 
excellent reviews of the physiological litera- 
ture testify to this point (Waelsch, 1951; 
Tower, 1959). In fact, there are few metabo- 
lites in carbohydrate and protein metabolism 
which are not related by some process to
glutamic acid. Consequently, glutamic acid 
would be implicated in conditions in which 
disturbances of cognitive functioning are 
traceable to disturbances in the major meta- 
bolic processes supplying nervous tissue. 

5. In 1943, Nachmansohn and his col- 
leagues (Nachmansohn, Cox, Coates, & Ma- 
chado, 1943; Nachmansohn & Machado, 
1943) established the mechanism of the 
acetylcholine cycle, a biochemical process
which provides energy for neural transmis- 
sion; and later Nachmansohn, John, and 
Waelsch (1943) accorded glutamic acid a 
major role in the cycle. Subsequently, 
glutamic acid became centrally implicated in 
many aspects of neural metabolism and func- 
tioning (Ames, 1956; Coxon & Peters, 1950; 
Davies & Krebs, 1952; Krnjević, Randić, & 
Straughn, 1964; Strecker, 1957; Terner, Eg- 
gleston, & Krebs, 1950; Tower, 1955, 1957; 
Way & Sutherland, 1963). If, as seems prob- 
able, disturbances in glutamic acid metabo- 
lism may lead to disturbances of cortical 
neural transmission, then these disturbances 
might also lead to disruption of cognitive func- 
tioning. 

6. It is now generally assumed that the 
major locus of action of glutamic acid is in 
the brain. It is known that glutamic acid 
passes the blood-brain barrier and enters the 
brain (Lajtha, Berl, & Waelsch, 1959; 
Roberts, Flexner, & Flexner, 1959; Waelsch, 
1958). However, the brain level of glutamic 
acid remains constant for a given individual 
no matter how much glutamic acid is intro- 
duced into the bloodstream (Friedberg & 
Greenberg, 1947; Schwerin, Bessman, & 
Waelsch, 1950). On the other hand, gluta- 
mine supplementation results in increases of 
the brain levels of glutamine (Waelsch, 1951; 
Weil-Malherbe, 1950). Consequently, it has 
been suggested that glutamic acid may effect 
behavioral change through the effect upon 
the brain of its metabolite, glutamine 
(Waelsch, 1951). Subsequent research has 
indicated that glutamine serves a major role 
in the maintenance of cerebral tissues 
(Strecker, 1957); indeed, there is indication 
that glutamine is essential for the growth and 
maintenance of almost all living cells 
(Meister, 1956). 
Consequently, some writers (Rogers &
Pelton, 1957) came to believe that any facil- 
itory effects of glutamic acid upon mental 
functioning may rest in its ability to produce 
glutamine. Rogers and Pelton (1957) found 
that glutamine increased the intelligence test 
scores of retardates significantly more than 
did a placebo, and Beley, Caustier, and 
Olievenstein (1964) also found significant 
differences between the effects of glutamine 
and a placebo on the test behavior of men-
tally retarded children. Farina, Collinet, and
Collinet-Marchal (1961) reported IQ im- 
provement following glutamine administration 
in children of normal intelligence, but these 
authors did not employ a control group. 

7. Glutamic acid is metabolized, in part, 
into an amine, gamma-aminobutyric acid 
(GABA). It has recently been suggested that 
GABA plays a major role at postsynaptic 
junctions by coordinating and regulating elec- 
trical activity, and by aiding the depolarized 
nerve to recover and fire once more (Roberts, 
Wein, & Simonsen, 1964). As such, GABA 
inhibits neural fatigue and, conversely, en- 
ables the neural fibers to be receptive to 
continuous stimulation. The ability to resist 
neural fatigue might well underlie the in- 
creased attention, persistence, and ability to 
perform simple repetitive tasks reported to 
follow the administration of glutamic acid 
(Müller, 1955; Schwöbel & Tamm, 1952). In
this connection, it is interesting to note that 
GABA is a monoamine which is metabolized 
by the enzyme monoamine oxidase (MAO).
MAO has recently been reported to be in- 
versely related to speed of performance of 
simple repetitive tasks in normal adults 
(Klaiber, Broverman, & Kobayashi, 1965).


CONCLUSION 


A considerable amount of evidence indi-
cates a role for glutamic acid in cognitive
functioning. The sheer weight of this evi-
dence is surprising; indeed, there is more
sound experimental confirmation of the posi-
tive effects of glutamic acid than appears to
exist for most of the psychotropic drugs
now in common therapeutic use. The earlier
judgments that glutamic acid has little behav-
ioral effect may be considered to be based
upon an incomplete or unsound analysis.
We have tried to discover and analyze the
problems which have hampered research. The 
major problem seems to have been a lack 
of rapport among behavioral scientists, cli- 
nicians, and biological scientists in an area in 
which communication would seem imperative 
if a competent research effort is to be 
maintained.
This note discusses Greenwald’s analysis of the implications of Nuttin’s work
for the status of the law of effect. The argument that Nuttin’s findings 
constitute a decisive refutation of the hypothesis of automatic action of 
rewards and punishments is examined. It is concluded, largely on methodo-
logical grounds, that Nuttin’s experiments failed to test the validity of the law
of effect.


Greenwald (1966) has taken strong excep- 
tion to the “neglect” of the work of Nuttin 
in discussions of the law of effect. In his view, 
Nuttin’s experiments constitute an as yet 
unrefuted attack on the hypothesis of auto- 
matic action of rewards. After reviewing some 
representative examples of Nuttin’s work and 
considering its implications for the results of 
other investigations, Greenwald expressed the 
hope that his presentation “will lead students 
of learning to look more skeptically at the 
law of effect. . . .” The purpose of the pres- 
ent note is twofold: (a) to point out some 
basic limitations of Nuttin’s experiments 
which disqualify them as critical tests of the 
law of effect and which may perhaps ac- 
count for the “neglect” of which Greenwald 
complains; and (b) to examine Greenwald’s 
attempts to reinterpret existing findings 
and conclusions in the light of Nuttin’s 
analyses.

Greenwald addressed himself to four prob- 
lems: (a) the nature of the measures used 
by Nuttin; (b) the effects of reward and
punishment in intentional learning; (c) the 
effects of reward and punishment in inci- 
dental learning; and (d) the spread of effect. 
The same organization is followed in the 
present note. All of Nuttin’s experiments to 
which reference is made were reported in 
his book, Tâche, réussite et échec (1953). 
Major emphasis is to be placed on the specific 
experiments singled out by Greenwald. 


Nuttin’s Measures of Performance 


The law of effect states a principle govern- 
ing the repetition of responses to recurrent 
stimuli. Accordingly, Thorndike and those 
who followed him used the frequency of
repetition of rewarded and punished responses
as the dependent variables in tests of the
law. By contrast, in virtually all of the perti- 
nent experiments of Nuttin, subjects were 
required to recall all their previous responses, 
that is, both rewarded and punished ones. 
This measure of recall is not appropriate in 
tests of the law of effect. It is apparent that 
recall of one’s prior responses is not equiva- 
lent to repetition of a response to a recurrent 
stimulus. As Marx (1956, p. 158) pointed out 
in an early discussion of Nuttin’s work, 
repetition and recall are operationally quite 
distinct measures of response strength. As one 
shifts from repetition to recall, the subject’s 
task changes drastically: he is no longer re- 
quired to respond to the stimulus per se but 
must seek to reproduce his own previous 
responses, and these two types of behavior 
may well be functions of different variables. 
A long time ago, Thorndike (1932) sought 
to forestall the very confusion which is 
reflected in the choice of the recall measure: 


The Law of Effect would not lead us to remember 
experiences that were pleasant and to forget experi- 
ences that were painful, but to remember experiences 
that have been pleasant to remember and to forget 
experiences that have been painful to remember, a 
very different matter [p. 458 f.]. 


One need only substitute “responses” for 
“experiences” to see the direct relevance of 
Thorndike’s point to the present question. 
Thus, whatever the merits of the law of effect 
as formulated by Thorndike, Nuttin’s experi- 
ments did not test it. Nuttin himself drew 
a clear distinction between recall and repeti- 
tion when he suggested that under a given 
set of circumstances subjects may or may not 
be disposed to repeat a response which they 
recall. In fact, he took pains to point out 
the great complexity of the conditions which 
he assumed to determine repetition (pp.
478-479). Nevertheless, he drew conclusions 
about the validity of the law of effect, that is, 
a law of repetition, on the basis of measures 
of recall. 

Given the distinction between repetition 
and recall, it is a legitimate hypothesis that 
repetition is determined jointly by the acces- 
sibility of prior responses to recall and other 
variables (e.g., the subject’s memory for the 
location of rewards in the series). Such a view 
cannot, however, be supported by substitu- 
tion of recall for repetition measures but 
requires a demonstration of the joint depend- 
ence of repetition on recall and other condi- 
tions of performance. In justifying Nuttin’s 
use of recall instead of repetition measures, 
Greenwald appealed to the distinction be- 
tween learning and performance. Greenwald’s 
treatment of this distinction was vague 
if not confusing. He said that “Nuttin’s 
type of measure is highly justified for the 
purpose of comparing the effects of reward 
and punishment. . . .” but he did not specify 
what the effects are supposed to be on. If the 
concern is with the effects on repetition, then 
repetition must be the dependent variable, 
and the conditions of learning and testing 
must be varied to determine the manner in 
which performance depends on each of these 
and on their interaction. The logic of such 
designs, as they have been developed in 
studies of habit strength and drive as de- 
terminants of performance, has been discussed 
by Spence (1960, pp. 150 ff.). In sum, if 
one is interested in the conditions determining 
the occurrence of Response A (repetition), 
one cannot measure Response B (recall) and 
then assume that A is a function of B plus 
some other nonmanipulated variables. One 
may, of course, be interested only in B as 
such, but one would have to recognize that 
he is not testing laws about the antecedents 
of A. 


Effects of Reward and Punishment in Inten- 
tional Learning 


There is little to be said under this heading. 
One of Nuttin’s experiments was a modifica- 
tion of a classical study by Thorndike (1931, 
pp. 31-32) in which the subject was required 
to choose among alternative “translations” of 
a series of words and was rewarded for some
of his choices and punished for others. Nuttin 
found that subjects recalled more of their 
rewarded than punished responses. Nuttin 
then proceeded to show in a number of 
experiments that conditions other than reward 
and punishment may produce differential 
recall. For example, when subjects were 
informed that some of the stimuli to which 
they were responding would recur in the 
experiment whereas others would not, they 
recalled their responses to the former better 
than to the latter. Results such as these 
probably require no more complex explana- 
tion than differential rehearsal, but Nuttin 
preferred to conceptualize them in terms of 
the effects of “persisting task tension.” A 
finding that certain types of instruction in- 
fluence recall says nothing about the processes 
responsible for the effects of reward and 
punishment on repetition (or even recall). 
Greenwald appeared to recognize that this 
kind of experiment has no direct implications 
for the law of effect, and the point need 
not be pressed any further. Thus, Nuttin left 
the problem of reward and punishment in 
intentional learning essentially where he 
found it. 


Effects of Reward and Punishment in Inci- 
dental Learning 


The theoretical issue is joined more 
sharply in relation to the effects of reward 
and punishment in incidental learning. The 
hypothesis of automatic action of reinforce- 
ment implies that rewarded responses should 
be repeated more frequently than punished 
ones even when there is no intention on the 
part of a subject to learn the responses and 
no expectation of a test of repetition. This 
implication was tested in the well-known 
experiment of Wallach and Henle (1941, 
1942) in which ESP instructions were used 
to create conditions of incidental learning. 
There was no difference between rewarded 
and punished responses in either repetition 
or recall. The experiment suffered, however, 
from a failure to control associative inter- 
ference. With this source of difficulty elimi- 
nated, Postman and Adams (1954) found 
both higher repetition and recall for rewarded 
than for punished responses. This study also
focused on the learning-performance distinction
to which reference was made earlier.
Test instructions concerning repetition of 
rewarded and punished responses were ma- 
nipulated independently in a factorial design. 
Performance instructions had significant ef- 
fects on repetition but not on recall. Correla- 
tions between repetition and recall were 
positive but varied widely in magnitude as a 
function of performance instructions. An 
analysis of the pattern of the subjects’ re- 
sponses indicated that under a given instruc- 
tion the probability of repetition can be pre- 
dicted with only moderate success on the 
basis of (a) the accessibility of the response 
to recall and (b) identification of the re- 
sponse as previously rewarded or punished. 
These analyses underscore the complexity of 
the relationship between recall and repetition. 

Let us now turn to Nuttin’s attack on the 
problem, as exemplified by the study which 
Greenwald singled out for discussion. The 
subjects’ task was to guess the numbers of 
familiar objects in photographs (to the 
nearest five), with half their guesses rewarded 
and half punished. On the test trial, on which 
subjects were required to recall their esti- 
mates, the original stimuli were not presented 
again but were instead named by the experi- 
menter. The numbers of rewarded and pun- 
ished responses recalled were approximately 
equal. Nuttin stated that the stimulus was 
changed in order to prevent the subject from 
making a new estimate instead of recalling his 
old response (p. 318). Thus, in a test of the 
law of effect it is deemed necessary to change 
the stimulus in order to prevent the old class 
of responses from recurring. Nuttin was 
probably in error in assuming that, in view 
of the distinctiveness of the stimulus items, 
the change from the original cards to verbal 
labels was of no consequence. There is evi- 
dence for substantial differences in transfer 
and interference when a similar change is 
made from simple geometric figures to the 
names of these figures (Postman, 1958). In 
sum, an hypothesis about the repetition of 
specific responses to specific stimuli was subjected
to what purports to be a critical test
by requiring subjects to give a new class of
responses to a new set of stimuli. The other
experiments in Nuttin’s series suffer from 
the same weaknesses: both stimuli and response
requirements were changed freely from
training to test. In some cases, repeated
training trials were used and the stimuli 
varied widely in form from trial to trial 
(Exps. RR and RF, pp. 324-332). It would 
be difficult to move further outside the 
boundary conditions of the law of effect in 
considering the effects of reward and punish- 
ment in incidental learning. 

In discussing the divergence between the 
conclusions of Nuttin and of Postman and 
Adams (1954), Greenwald was disposed to 
dismiss the results of the latter as an artifact 
of isolation effects since only 3/25 of the 
subjects’ responses were rewarded. The pos- 
sible effects of isolation were evaluated in 
a subsequent study (Postman & Adams, 
1955) in which the proportion of rewards 
was varied between 4/21 and 18/21. The 
rate of repetition of rewarded responses did 
not vary significantly as a function of isola- 
tion and consistently exceeded that for pun- 
ished responses. These results were in turn 
dismissed by Greenwald on the grounds that 
subjects achieving apparently high scores in 
an ESP situation “must have been impressed 
by their strikingly above-chance ESP ability, 
thus necessarily causing some perceptual en- 
hancement if not (technically) ‘isolation.’ ” 
Greenwald did not say what components of 
the task benefited from the “necessary” per- 
ceptual enhancement—the stimuli, the re- 
sponses, the associations, or the number and 
location of the rewards. One can explain away 
almost any finding by phenomenologizing 
post hoc on behalf of the subject. In this 
case, however, the post hoc explanation was 
offered in the teeth of direct evidence to the 
contrary. Specifically, it was shown that sub- 
jects’ ability to identify their previous re- 
sponses as rewarded and punished varied 
substantially as a function of the manipu- 
lated degree of isolation, the number of cor- 
rect identifications being significantly higher 
when the proportion of rewards was small 
than when it was high. Isolation did not, 
however, significantly influence recall or the 
differential effectiveness of rewards and punishments.
In the light of these findings, Postman
and Adams (1955) emphasized the need
to distinguish sharply between the isolation 
of responses, reinforcements, and stimulus-response
associations. This need still exists.


Spread of Effect 


Greenwald based his discussion of the 
spread of effect on Nuttin’s studies of sub- 
jects’ memory for the positions of rewards 
in a series. After a training trial in which
some responses were rewarded and others 
were punished, the subject was required not 
only to recall his previous responses but also 
to indicate which of them had been called 
“right” and which had been called “wrong.” 
Nuttin found that punished responses ad- 
jacent to the rewarded position were more 
likely to be recalled as rewarded than those 
at positions more remote from the reward 
(spread of recall). This gradient was obtained 
even when the positions of the punished re- 
sponses were scrambled on the test trial while 
the positions of the rewards were held con- 
stant. Recall of responses, on the other hand, 
was not found to vary as a function of dis- 
tance from the rewarded position. A spread 
of recall of failures was also obtained 
around an isolated punishment (Exp. C6, pp.
236-238). Nuttin attributed the spread of 
recall to a “vague and general schema” of 
the succession of rewards and punishments 
which influenced subjects’ recalls of afteref- 
fects. He emphasized that the schema is very 
vague and very inaccurate, especially when 
the pattern of the series is complex (p. 238). 
He concluded that it is “possible” that the 
spread of recall “plays a part” in the spread 
of effect (p. 478). 

Greenwald was disposed to be less cautious 
and to go further than Nuttin. In his view 
“It is but a short further step to the conclusion
that ‘spread of effect’ phenomena may
be produced as an artifact of these tenden- 
cies.” It is, actually, a considerable leap in 
view of the fact that Nuttin did not measure 
the spread of effect proper in any of his 
experiments and did not establish a relation- 
ship between the spread of recall and the 
spread of effect. There are other artifacts 
which can result in a spread of effect, such 
as guessing sequences. Greenwald appealed 
to the latter in discounting the results of 
Muenzinger and Dove (1937) which Marx 
(1956) cited as evidence contrary to Nuttin’s
hypothesis. How much weight should be given 
to one artifact, and how much to the other? 
The necessary facts are not in hand. There 
are, however, reasons to be skeptical about 
the role played by Nuttin’s spread of recall 
in determining the spread of effect. Before 
these reasons are discussed, Nuttin’s evidence 
for the spread of recall will be examined 
briefly. 

In Nuttin’s major study of the recall of 
rewards (Exp. C, pp. 224-232), the “foreign 
word” situation mentioned above was used. 
There were 11 groups of subjects. Three pat- 
terns of rewards and punishments were used, 
with the numbers of punished responses at 
different distances from the reward varying 
widely. It should be noted especially that 
each pattern included cases in which a single 
“wrong” was interspersed between the two 
“rights” (RWR). Nuttin reported that the 
percentage of erroneous recalls of rewards 
was 27 one step away from the actual reward, 
and 22.1 at more distant positions; this difference
was significant. A closer examination
of the data shows, however, that this result
is due almost entirely to the high percentage
of errors for punished items in the RWR
sequences. When these cases are excluded, 
the means of the percentages of errors of the 
11 subgroups are 24.4, 24.2, and 23.3 for the 
first, second, and third positions away from 
reward, respectively. By contrast, the per- 
centage for the punished responses in RWR 
sequences is 33.8 (Table 25, p. 229). Note
that in the large majority of experiments on 
the spread of effect following Thorndike’s, 
RWR sequences were excluded. The largest 
drop in the repetition of punished responses 
typically occurred between the first and
second positions following reward. As far as
Nuttin’s data go, therefore, there is no spread
of recall which is applicable to the spread
of effect observed in these experiments. This
reexamination of the data is instructive in
view of the strong conclusions which Greenwald
based on them. The point need not,
however, be pressed unduly since a significant
gradient in the recall of rewards was obtained
in other experiments in which RWR
sequences were excluded and rewards were
widely separated in the series (Postman &
Adams, 1954, 1955). The latter procedure
appears to be essential, therefore, for the
demonstration of a spread of recall for
sequences other than RWR.

It was noted that there are reasons, none 
of them perhaps decisive in itself, to doubt
that the spread of effect will be shown to 
depend on the spread of recall: (a) Although
[...] suggested that the gradient is symmetrical. In
experiments on the spread of effect, on the 
other hand, significant fore-gradients were 
rarely obtained, whereas a reliable after-gradient
has been a typical finding. The loci
of the two types of spread may, therefore, 
not coincide. (b) As stated earlier, the limited 
analyses which are available do not point to 
a dependable relationship between recall of 
responses and aftereffects on the one hand, 
and repetition on the other. (c) A recent 
finding by Alfert (1963) suggests that the 
spread of effect proper may not be seriously 
influenced by the kind of vague cognitive 
schema that Nuttin has described. In one 
of Alfert’s conditions, the subjects’ responses
on the test trial were rewarded and punished
in the same positions as on the acquisition
trial; in another condition, new items were
substituted for the rewarded ones and the
responses in these key positions called
“wrong” by the experimenter. The after-gradients
of repetition of punished responses
were identical although the structure of the
series was maintained in one case and drastically
changed in the other. (d) While Nuttin
found substantial spread of recall around iso- 
lated punished responses, the evidence for a 
corresponding gradient of variability in the 
repetition of responses is far from clear
(Marx, 1956, pp. 167-169). All these con- 
siderations argue for extreme caution in the 
extrapolation of Nuttin’s results in the ab- 
sence of direct tests of the relationship 
between the spread of recall and the spread 
of effect. 

One other point raised by Greenwald deserves
comment. For reasons which remain
unclear, Greenwald took exception to Postman’s
criticism of Zirkle’s (1946) interpretation
of the spread of effect in terms of
perceptual isolation. It had been pointed out
(Postman, 1962) that Zirkle’s theory leads 
to the highly implausible conclusion that 
proximity to an isolated position leads at the 
same time to better retention of responses 
and to a systematic distortion of the memory 
for aftereffects. In objecting to this com- 
ment, Greenwald irrelevantly referred to the 
evidence for spread of recall (distortion of 
memory for aftereffects) found by both 
Nuttin, and Postman and Adams. The occur- 
rence of a spread of recall was, of course, 
not at issue, and Greenwald missed the point 
of the criticism directed at Zirkle’s hypothe- 
sis. The crux of the argument is that, in 
Zirkle’s view, the isolation of rewards (and 
of stimuli) leads to the isolation of the re- 
sponses to which they belong. Thus, isolation 
is assumed to spread within units (S-R- 
reinforcement sequences) as well as between 
units. Isolation implies differentiation from
surrounding events and favors retention. 
Superior recall should, therefore, be predicted 
for both the response and aftereffect com- 
ponents of an isolated unit. The same logic 
applies to both isolated units proper and 
those which acquire isolation through spread.
Within the framework of the isolation hy- 
pothesis it becomes an internal contradiction, 
therefore, to predict superior retention of a 
given response and at the same time distor- 
tion of memory for the aftereffects of that re- 
sponse. Greenwald’s objection is off the mark 
because he considered part of a conclusion 
out of its theoretical context. 


Conclusion 


Greenwald has presented an overdrawn pic- 
ture of the implications of Nuttin’s work for 
the status of the law of effect. Nuttin did not, 
in fact, carry out any direct tests of the law 
of effect. This judgment is not intended as 
a comment one way or the other about the 
validity of the law or about the possible rele- 
vance of Nuttin’s findings to other questions. 
If and when Nuttin’s hypotheses are tested
under conditions which bear directly on the 
law of effect, the picture may change. In the 
meantime, Greenwald should recognize that 
the status accorded to a theoretical position 
must depend on the relevance of the evidence
offered in its support.